A new concept of (asymptotic) qualitative robustness for plug-in estimators based
on identically distributed possibly dependent observations is introduced, and it is
shown that Hampel's theorem for general metrics $d$ still holds. Since Hampel's
theorem assumes the UGC property w.r.t. $d$, i.e. convergence in probability of the
empirical probability measure to the true marginal distribution w.r.t. $d$ uniformly
in the class of all admissible laws on the sample path space, this property is shown
for a large class of strongly mixing laws for three different metrics $d$. For real-valued
observations the UGC property is established for both the Kolmogorov $\phi$-metric and
the Lévy $\psi$-metric, and for observations in a general locally compact and second
countable Hausdorff space the UGC property is established for a certain metric
generating the $\psi$-weak topology. The key is a new uniform weak LLN for strongly
mixing random variables. The latter is of independent interest and relies on Rio's
maximal inequality.

Keywords: plug-in estimator, qualitative robustness, Hampel's theorem, strong mixing,
Rio's maximal inequality, function bracket, Kolmogorov $\phi$-metric, locally compact
and second countable Hausdorff space, $\psi$-weak topology, Lévy $\psi$-metric, uniform
Glivenko-Cantelli theorem, uniform weak law of large numbers



Department of Mathematics, Saarland University, Germany, zaehle@math.uni-sb.de



1. Introduction 

Let $\mathcal{M}$ be a class of probability measures on some measurable space $E$, and $T$ be a
mapping (statistical functional) from $\mathcal{M}$ into a measurable space $\mathbf{T}$. Let $(X_i)_{i\in\mathbb{N}}$ be a
sequence of $E$-valued random elements (observations) being identically distributed
according to $\mu\in\mathcal{M}$. If $\hat m_n = \frac{1}{n}\sum_{i=1}^n \delta_{X_i}$ denotes the empirical distribution of $X_1,\dots,X_n$,
then $T(\hat m_n)$ can provide a reasonable estimator for $T(\mu)$. Informally, the sequence
$(T(\hat m_n))$ is qualitatively robust when for large $n$ a small change in $\mu$ results only in a
small change of the law of the estimator $T(\hat m_n)$. More precisely, given a subset $\mathcal{P}_1\subset\mathcal{M}$,
the sequence of estimators $(T(\hat m_n))$ is said to be qualitatively $\mathcal{P}_1$-robust at $\mu\in\mathcal{P}_1$ if for
every $\varepsilon>0$ there are some $\delta>0$ and $n_0\in\mathbb{N}$ such that

$$\nu\in\mathcal{P}_1,\quad d(\mu,\nu)\le\delta \quad\Longrightarrow\quad d'\big(\mathrm{law}\{T(\hat m_n)\,|\,\mu\},\,\mathrm{law}\{T(\hat m_n)\,|\,\nu\}\big)\le\varepsilon \quad \forall\, n\ge n_0, \tag{1}$$

where $d$ and $d'$ are metrics on $\mathcal{P}_1$ and on the class of all probability measures on $\mathbf{T}$,
respectively. The basic concept of qualitative robustness was initiated by Hampel
[16, 17], but the above version of qualitative robustness is due to Huber [19]. Huber's version
is also called asymptotic robustness (cf. [25]) and differs from the original definition in
[16, 17] in that the right-hand side in (1) is not required to hold for all $n\in\mathbb{N}$ but only
for all $n\ge n_0$ for some $n_0 = n_0(\varepsilon)$. For background see also [HI [181 EHl [211 [Ml I2Z! and
references therein.

The definition of qualitative robustness stated above was introduced by Hampel and
Huber to capture the case of independent observations. To capture also the case of
dependent observations, various authors departed from this definition and considered,
instead of a metric $d$ on a class of probability measures on $E$, a metric $d_\infty$ on a class of
probability measures on the infinite product space $E^{\mathbb{N}}$ or a sequence of pseudo-metrics
$(d_n)$ on classes of probability measures on $E^n$, $n\in\mathbb{N}$; cf. [3, 7, 10, 17, 25]. However, in
the usual situation where one is interested in the estimation of an aspect $T(\mu)$ of the
marginal distribution $\mu$ based on the observations $X_1,\dots,X_n$, and where the contaminated
observations are still identically distributed (according to some $\nu$ close to $\mu$), it
might also be reasonable to retain the original definition. After all, for increasing sample
size $n$ the impact of the data dependence on the estimation often declines. So one might
hope that if the dependence structures induced by the "admissible" probability measures
on $E^{\mathbb{N}}$ (which play the role of the laws of the sequences of identically distributed
observations) are subject to a common constraint, then the implication (1) still holds for
the class $\mathcal{P}_1$ of the marginal distributions. In other words, under a common constraint
for the dependence structures, it might be sufficient to ensure a small distance between
the marginal distributions $\mu$ and $\nu$ in order to obtain a small distance between the
distributions of the plug-in estimators, based on large $n$, under two "admissible" laws on
$E^{\mathbb{N}}$ with marginal distributions $\mu$ and $\nu$, respectively.

In this article, we will demonstrate that the latter is in fact true under fairly weak
constraints for the dependence structures. In Section 2 we adapt Huber's definition of
qualitative robustness to the case of dependent observations and establish the analogue
of Hampel's theorem. Since the latter assumes the UGC property, i.e. convergence
in probability of the empirical probability measure to the true marginal distribution
uniformly in the class of all "admissible" laws on $E^{\mathbb{N}}$ (cf. Definition 2.3 below), this
property will be established for a large class of strongly mixing laws on $E^{\mathbb{N}}$ for three
different metrics. In Section 3.1 we consider real-valued observations ($E=\mathbb{R}$) and
verify the UGC property for both the Kolmogorov $\phi$-metric and the Lévy metric. In
Section 3.2 we assume observations in a general locally compact and second countable
Hausdorff space $E$ and verify the UGC property for a certain metric generating the
$\psi$-weak topology. For both examples the key is a new uniform weak LLN for strongly
mixing random variables which is given in Appendix A and which relies on Rio's
maximal inequality.

It should be stressed that for the considerations of this article it is essential that the
definition of qualitative robustness (Definition 2.1 below) is in line with Huber's version
of qualitative robustness [19], i.e. with asymptotic robustness. Indeed, from Example
1.18 in [5] it is easily seen that for fixed $n$, weak dependence can change the distribution
of an estimator even if the marginal distributions of the observed data are the same.
In this respect, the intention of this article differs from the objective of the existing
literature on qualitative robustness for dependent observations [2, 3, 10, 17, 25] where
the estimators are demanded to be "stable" not only for large but also for small samples
(and are allowed to be more general than plug-in estimators). The latter notion of
robustness requires a more sophisticated definition compared to Definition 2.1 below.
See, for instance, [3] for an informative discussion on a proper choice for the definition
of qualitative robustness in this context.

2. Qualitative robustness and a Hampel theorem 

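Let $(E,\mathcal{E})$ be a measurable space, $\Omega := E^{\mathbb{N}}$, $\mathcal{F} := \mathcal{E}^{\otimes\mathbb{N}}$, $X_i$ be the $i$th coordinate
projection on $\Omega$, and $\mathbb{P}_i := \mathbb{P}\circ X_i^{-1}$ be the $i$th marginal distribution of a probability measure
$\mathbb{P}$ on $(\Omega,\mathcal{F})$. Let $\mathcal{P}$ be a class of probability measures on $(\Omega,\mathcal{F})$ such that $\mathbb{P}_1=\mathbb{P}_2=\cdots$
for every $\mathbb{P}\in\mathcal{P}$. Let $\mathcal{P}_1 := \{\mathbb{P}_1 : \mathbb{P}\in\mathcal{P}\}$ be the corresponding class of all marginal
distributions, and $\mathcal{M}$ be any subset of the class $\mathcal{M}_1(E)$ of all probability measures on
$(E,\mathcal{E})$ such that $\mathcal{P}_1\subset\mathcal{M}$. Let $(\mathbf{T},\mathcal{T})$ be a measurable space, and $T:\mathcal{M}\to\mathbf{T}$ be a
mapping (statistical functional). For every $n\in\mathbb{N}$, we assume that the mapping

$$\hat T_n(x) = \hat T_n(x^{(n)}) := T(\hat m_{x^{(n)}}),\qquad x=(x_1,x_2,\dots)\in\Omega, \tag{2}$$

is $(\mathcal{E}^{\otimes\mathbb{N}},\mathcal{T})$-measurable, where $\hat m_{x^{(n)}} := \frac{1}{n}\sum_{i=1}^n\delta_{x_i}$ denotes the empirical probability
measure associated with $x^{(n)} := (x_1,\dots,x_n)$. For (2) to be well defined, we assume that the
set of all such empirical probability measures is contained in $\mathcal{M}$. Notice that $\hat T_n$ provides
an estimator for $T(\mu)$, $\mu$ being the common marginal distribution of the observations. We let
$d'$ be some metric on the set $\mathcal{M}_1(\mathbf{T})$ of all probability measures on $(\mathbf{T},\mathcal{T})$, and $d$ be some
metric on $\mathcal{P}_1$.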
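As a minimal illustration of the plug-in construction in (2), the following Python sketch represents an empirical probability measure by its atoms and weights and evaluates a functional on it. It uses only the standard library; the function names (`empirical_measure`, `mean_functional`, `variance_functional`, `plug_in`) are hypothetical and chosen for this illustration only, and only real-valued observations are considered.

```python
from collections import Counter
from fractions import Fraction


def empirical_measure(xs):
    """Return the empirical measure of xs as a dict {atom: weight}."""
    n = len(xs)
    return {atom: Fraction(count, n) for atom, count in Counter(xs).items()}


def mean_functional(mu):
    """T(mu) = integral of y mu(dy) for a finitely supported measure mu."""
    return sum(atom * weight for atom, weight in mu.items())


def variance_functional(mu):
    """T(mu) = integral of (y - mean)^2 mu(dy) for a finitely supported mu."""
    m = mean_functional(mu)
    return sum((atom - m) ** 2 * weight for atom, weight in mu.items())


def plug_in(T, xs):
    """Plug-in estimator: evaluate the functional T at the empirical measure."""
    return T(empirical_measure(xs))
```

For instance, `plug_in(mean_functional, [1, 2, 3, 4])` evaluates the mean functional at the empirical measure of the four observations, which is the sample mean.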

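Definition 2.1 (Qualitative robustness) Let us take the notation from above, and let
$\mathbb{P}\in\mathcal{P}$. Then the sequence $(\hat T_n)$ of estimators is said to be qualitatively $\mathcal{P}$-marginally
robust at $\mathbb{P}$ w.r.t. $(d,d')$ if for every $\varepsilon>0$ there are some $\delta>0$ and $n_0\in\mathbb{N}$ such that

$$\mathbb{Q}\in\mathcal{P},\quad d(\mathbb{P}_1,\mathbb{Q}_1)\le\delta \quad\Longrightarrow\quad d'\big(\mathbb{P}\circ\hat T_n^{-1},\,\mathbb{Q}\circ\hat T_n^{-1}\big)\le\varepsilon \quad \forall\, n\ge n_0.$$

Sometimes also the functional $T$ itself will be called qualitatively $\mathcal{P}$-marginally robust
at $\mathbb{P}$ if the corresponding sequence of plug-in estimators $(\hat T_n)$ is.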

Remark 2.2 If every $\mathbb{P}\in\mathcal{P}$ is an infinite product measure, i.e. $\mathbb{P}=\mathbb{P}_1^{\otimes\mathbb{N}}$, then
Definition 2.1 coincides with the classical definition of qualitative robustness for independent
observations; cf. US [201 [21]. ◇



In applications, the validity of qualitative robustness in the sense of Definition 2.1
is typically hard to check "directly". So it is natural to ask for transparent sufficient
conditions. In the framework of Remark 2.2, the celebrated Hampel theorem provides
a sufficient condition for qualitative robustness when $\mathcal{M}=\mathcal{P}_1=\mathcal{M}_1(E)$, $E$ is Polish,
and $d$ and $d'$ are the Prohorov metrics; cf. [IH \T7\ \W\ [2U1 124) . In [21], this result was
extended to more general metric spaces $(\mathcal{P}_1,d)$ but still in the framework of Remark 2.2.
In Theorem 2.4 below we will formulate a version of Hampel's criterion which can also
be applied to our general setting. As usual, we choose $d'$ as the Prohorov metric. To
this end, we assume that $\mathbf{T}$ is equipped with a complete and separable metric $d_{\mathbf{T}}$ and
that $\mathcal{T}$ is the corresponding Borel $\sigma$-field. Recall that the Prohorov metric is given by

$$d'_{\mathrm{Proh}}(\mu,\nu) := \inf\{\varepsilon>0 : \mu[A]\le\nu[A^\varepsilon]+\varepsilon \text{ for all } A\in\mathcal{T}\},$$

where $A^\varepsilon := \{t\in\mathbf{T} : \inf_{a\in A} d_{\mathbf{T}}(t,a)\le\varepsilon\}$ is the $\varepsilon$-hull of $A$. Moreover, we set
$\hat m_n(x) := \hat m_{x^{(n)}}$ for all $x\in\Omega$. In the following definition, which is a generalization of
Definition 2.3 in [21], the acronym UGC stands for "uniform (weak) Glivenko-Cantelli".

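Definition 2.3 (UGC property) We say that $\mathcal{P}$ admits the UGC property w.r.t. $d$ if
for every $\delta>0$

$$\lim_{n\to\infty}\,\sup_{\mathbb{P}\in\mathcal{P}}\,\mathbb{P}\big[d(\hat m_n,\mathbb{P}_1)\ge\delta\big] = 0. \tag{3}$$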

Theorem 2.4 (Hampel-type theorem) Assume that $\mathcal{P}$ admits the UGC property w.r.t.
$d$, and let $\mathbb{P}\in\mathcal{P}$. Then, if the mapping $T$ is continuous at $\mathbb{P}_1$ w.r.t. $(d,d_{\mathbf{T}})$, the sequence
$(\hat T_n)$ is qualitatively $\mathcal{P}$-marginally robust at $\mathbb{P}$ w.r.t. $(d,d'_{\mathrm{Proh}})$.

The proof of Theorem 2.4 can be found in Section 4.



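Remark 2.5 Let $\mathcal{M}_{1,\mathrm{emp}}$ denote the space of all empirical probability measures $\hat m_{x^{(n)}}$
with $x\in\Omega$ and $n\in\mathbb{N}$, and recall that we assumed $\mathcal{M}_{1,\mathrm{emp}}\subset\mathcal{M}$. It is clear from the
proof in Section 4 that in Theorem 2.4 it suffices to require that $T$ is $\mathcal{M}_{1,\mathrm{emp}}$-continuous
at $\mathbb{P}_1$ w.r.t. $(d,d_{\mathbf{T}})$, meaning that for every $\varepsilon>0$ there is some $\delta>0$ such that for all
$\nu\in\mathcal{M}_{1,\mathrm{emp}}$ with $d(\mathbb{P}_1,\nu)\le\delta$ we have that $d_{\mathbf{T}}(T(\mathbb{P}_1),T(\nu))\le\varepsilon$. ◇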



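Remark 2.6 Adapting the proof of Theorem 2.6 in [21], one also obtains a sort of
converse of Hampel's theorem. More precisely, fix $\mathbb{P}\in\mathcal{P}$ and assume that $(\hat T_n)$ is weakly
consistent w.r.t. $d_{\mathbf{T}}$ (i.e. $\hat T_n$ converges in $\mathbb{Q}$-probability to $T(\mathbb{Q}_1)$ w.r.t. $d_{\mathbf{T}}$) at every
$\mathbb{Q}\in\mathcal{P}$ for which $\mathbb{Q}_1$ lies in some neighborhood of $\mathbb{P}_1$ w.r.t. $d$. Then qualitative
$\mathcal{P}$-marginal robustness of the sequence $(\hat T_n)$ at $\mathbb{P}$ w.r.t. $(d,d'_{\mathrm{Proh}})$ implies that the mapping
$T$ is $\mathcal{P}_1$-continuous at $\mathbb{P}_1$ w.r.t. $(d,d_{\mathbf{T}})$. The latter means that for every $\varepsilon>0$ there is
some $\delta>0$ such that $d_{\mathbf{T}}(T(\mathbb{P}_1),T(\nu))\le\varepsilon$ for every $\nu\in\mathcal{P}_1$ with $d(\mathbb{P}_1,\nu)\le\delta$. ◇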

In Section 3 we will give two examples for classes of probability measures on $(\Omega,\mathcal{F})$
admitting the UGC property. To motivate the Hampel-type Theorem 2.4, we will also
discuss continuity of particular statistical functionals w.r.t. the involved metrics.

The classical choice for $d$ in the framework of Definition 2.1 is any metric generating
the weak topology. The most prominent examples are the Prohorov metric and the Lévy
metric used in [iJl ITTl [T9| [20l ^^ and many further references. Another example is the
bounded Lipschitz metric used, for instance, in [8, 12, 19, 20]. However, for some purposes
it is somewhat restrictive to use exclusively a metric generating the weak topology.
The use of such a metric creates a sharp division of the class of statistical functionals $T$
into those for which $(\hat T_n)$ is "robust" and those for which $(\hat T_n)$ is "not robust". Indeed,
Hampel's theorem says that $(\hat T_n)$ is "robust" if and only if $T$ is continuous w.r.t. the weak
topology. But the distributions of the plug-in estimators of two statistical functionals
that are not continuous w.r.t. the weak topology may react quite differently to changes in
the underlying (marginal) distribution, just as these plug-in estimators may have quite
different influence functions. For this reason it was proposed in [21] to investigate $(\hat T_n)$
for qualitative robustness w.r.t. more general metrics, where it is clear that qualitative
robustness w.r.t. a metric $d_1$ is a stronger condition than qualitative robustness w.r.t.
a metric $d_2\ge d_1$. In particular, a statistical functional $T_1$ can be considered to have a
higher "degree of robustness" than another statistical functional $T_2$ when $T_1$ is qualitatively
robust for any choice of $d$ for which $T_2$ is qualitatively robust. In this way, it becomes
possible to differentiate plug-in estimators w.r.t. qualitative robustness within the class
of statistical functionals that are not weakly continuous. Sensible classes of metrics that
can be studied in this context are, for instance, $\{d_{(\phi)}\}_\phi$ and $\{d_{\psi,\mathrm{vag}}\}_\psi$ to be introduced
in (4) and (7), respectively, where $\phi$ and $\psi$ (or rather their increases) can be seen as
gauges for the strictness of the metric and thus for the "degree of robustness". For
details see [21, 22]. The latter reference provides in particular a rigorous quantification
of the "degree of robustness" for convex risk functionals.

The preceding discussion counters somewhat the conventional point of view that the
sequence $(\hat T_n)$ of plug-in estimators can be considered to be qualitatively robust
exclusively when $T$ is continuous w.r.t. the weak topology. But even if one insists on the use of
the weak topology, the considerations below provide new results. For instance, Corollary
3.7 and Theorem 2.4 together yield a nontrivial generalization of the classical Hampel
theorem in the form of [19, Theorem 2.21]. In this theorem the underlying metric is the
Lévy metric which generates the weak topology.

In the sequel we will repeatedly work with left- and rightcontinuous inverses. Recall
that the leftcontinuous inverse of any nondecreasing function $H:\mathbb{R}\to\mathbb{R}$ is defined by
$H^{\leftarrow}(t) := \inf\{y\in\mathbb{R} : H(y)\ge t\}$ with the convention $\inf\emptyset := \infty$. The rightcontinuous
inverse $H^{\rightarrow}$ of any nonincreasing function $H:\mathbb{R}_+\to\mathbb{R}_+$ is defined by $H^{\rightarrow}(t) := \sup\{y\in\mathbb{R}_+ : H(y)>t\}$
with the convention $\sup\emptyset := 0$. We will also repeatedly use the notion
of strong mixing which is recalled in Appendix A. The $n$th strong mixing coefficient
of the coordinate process $(X_i)$ under the law $\mathbb{P}$ will be denoted by $\alpha_{\mathbb{P}}(n)$.
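To make the leftcontinuous inverse concrete, here is a small Python sketch for the special case where $H$ is the empirical distribution function of a finite sample, so that $H^{\leftarrow}(t)$ is the lower $t$-quantile of the sample. The function name `left_continuous_inverse` is hypothetical, and the restriction to empirical distribution functions is an assumption made only for this illustration.

```python
import math


def left_continuous_inverse(sample, t):
    """H^<-(t) = inf{y : H(y) >= t} for the empirical d.f. H of sample."""
    if t <= 0:
        return -math.inf   # H(y) >= t holds for every real y
    if t > 1:
        return math.inf    # convention: infimum of the empty set is +infinity
    ordered = sorted(sample)
    n = len(ordered)
    # H jumps to k/n at the k-th order statistic, so we need the smallest k
    # with k/n >= t. Note: t * n is a floating-point product, so values of t
    # extremely close to a jump height k/n may be affected by rounding.
    k = math.ceil(t * n)
    return ordered[k - 1]
```

For the sample `[3, 1, 4, 2]` the call `left_continuous_inverse([3, 1, 4, 2], 0.5)` returns the second smallest observation, since the empirical distribution function first reaches the level 1/2 there.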

3. Examples 

3.1. Strong mixing, and Kolmogorov $\phi$-metric or Lévy metric

In this section, we will see that in the case $E=\mathbb{R}$ a large class of probability measures on
$(\Omega,\mathcal{F})$ admits the UGC property w.r.t. (a weighted version of) the Kolmogorov metric.
As a corollary we will also obtain the UGC property w.r.t. the Lévy metric. Let $\phi$ be
a u-shaped function, i.e. a continuous function $\phi:\mathbb{R}\to[1,\infty)$ that is nonincreasing on
$(-\infty,0)$ and nondecreasing on $(0,\infty)$. Then

$$d_{(\phi)}(\mu,\nu) := \sup_{y\in\mathbb{R}}\,|F_\mu(y)-F_\nu(y)|\,\phi(y) \tag{4}$$

defines a metric on the set $\mathcal{M}_1^{(\phi)}(\mathbb{R})$ of all probability measures $\mu$ on $(\mathbb{R},\mathcal{B}(\mathbb{R}))$ for which
$d_{(\phi)}(\mu,\delta_0)<\infty$. We will refer to $d_{(\phi)}$ as Kolmogorov $\phi$-metric. Notice that $\mu\in\mathcal{M}_1^{(\phi)}(\mathbb{R})$
if $\int\phi\,d\mu<\infty$, and that $d_{(\phi)}$ is just the classical Kolmogorov metric for $\phi :\equiv 1$.
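For two finitely supported measures the supremum in (4) can be evaluated exactly, which the following Python sketch illustrates for the empirical measures of two samples. The function name `kolmogorov_phi_distance` is hypothetical. The sketch relies on two observations: the difference of the two distribution functions is constant between consecutive pooled atoms, and a u-shaped $\phi$ (being nonincreasing to the left of 0 and nondecreasing to the right) has its supremum over any interval at one of the endpoints.

```python
from bisect import bisect_right


def kolmogorov_phi_distance(xs, ys, phi):
    """d_(phi) between the empirical measures of xs and ys (phi u-shaped)."""
    sx, sy = sorted(xs), sorted(ys)
    pooled = sorted(set(sx) | set(sy))
    best = 0.0
    for j, z in enumerate(pooled):
        # The difference of the two empirical distribution functions is
        # constant on [z, z_next), where z_next is the next pooled atom.
        diff = abs(bisect_right(sx, z) / len(sx) - bisect_right(sy, z) / len(sy))
        if j + 1 < len(pooled):
            # Supremum of a continuous u-shaped phi over [z, z_next) equals
            # the larger of its values at the two endpoints.
            weight = max(phi(z), phi(pooled[j + 1]))
        else:
            weight = phi(z)  # diff is 0 at and beyond the largest atom
        best = max(best, diff * weight)
    return best
```

With `phi = lambda y: 1.0` this reduces to the classical two-sample Kolmogorov distance; a faster-growing weight such as `lambda y: 1.0 + abs(y)` penalizes discrepancies in the tails more heavily.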

In [21 [3H] it is demonstrated that many L- and V-functionals $T$ as well as many
coherent risk functionals $T$ are continuous w.r.t. $d_{(\phi)}$, with $\phi$ depending on the particular
functional $T$; see also Example 3.4 for some simple examples. So, in view of the
Hampel-type Theorem 2.4, for a given class $\mathcal{P}$ of probability measures on $(\Omega,\mathcal{F})$, the
corresponding sequence $(\hat T_n)$ of plug-in estimators is qualitatively $\mathcal{P}$-marginally robust
w.r.t. $(d_{(\phi)},d'_{\mathrm{Proh}})$ at any $\mathbb{P}\in\mathcal{P}$ if $\mathcal{P}$ admits the UGC property w.r.t. $d_{(\phi)}$ in the sense
of Definition 2.3. The following Theorem 3.1 shows that the latter is true if under every
$\mathbb{P}\in\mathcal{P}$ the coordinates of the coordinate process $(X_i)$ are identically distributed and
strongly mixing with uniformly (in $\mathbb{P}\in\mathcal{P}$) decaying mixing coefficients $(\alpha_{\mathbb{P}}(n))$ and if
the class of marginal distributions $\mathcal{P}_1$ is uniformly $\phi$-integrating. It is remarkable that
the common rate of decay of the mixing coefficients may be arbitrarily slow.

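Theorem 3.1 (UGC property w.r.t. $d_{(\phi)}$) Let $\phi$ be a u-shaped function, and $\mathcal{P}$ be a
class of probability measures on $(\Omega,\mathcal{F})$ such that

(a) $\mathbb{P}_1=\mathbb{P}_2=\cdots$ for all $\mathbb{P}\in\mathcal{P}$,

(b) $\lim_{K\to\infty}\sup_{\mathbb{P}\in\mathcal{P}}\int\phi(y)\,\mathbb{1}_{\{\phi(y)\ge K\}}\,\mathbb{P}_1(dy) = 0$,

(c) $\lim_{n\to\infty}\sup_{\mathbb{P}\in\mathcal{P}}\alpha_{\mathbb{P}}(n) = 0$.

Then, for every $\delta>0$,

$$\lim_{n\to\infty}\,\sup_{\mathbb{P}\in\mathcal{P}}\,\mathbb{P}\big[d_{(\phi)}(\hat m_n,\mathbb{P}_1)\ge\delta\big] = 0. \tag{5}$$

The proof of Theorem 3.1 can be found in Section 5. Notice that (c) implies in
particular that the coordinate process $(X_i)$ is strongly mixing under every $\mathbb{P}\in\mathcal{P}$.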
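The statement of Theorem 3.1 can be explored numerically for a single law. The Python sketch below is only an illustration under explicit assumptions: it takes $\phi\equiv 1$, uses a stationary Gaussian AR(1) process as one example of a strongly mixing sequence (a choice made here, not taken from the theorem), and estimates the probability in (5) for that single law by Monte Carlo. All function names are hypothetical, and a simulation of one law does not, of course, verify the uniformity over a class $\mathcal{P}$.

```python
import math
import random


def normal_cdf(y, sd):
    """Distribution function of the centered normal law with std. deviation sd."""
    return 0.5 * (1.0 + math.erf(y / (sd * math.sqrt(2.0))))


def simulate_ar1(n, rho, rng):
    """Stationary Gaussian AR(1) path X_t = rho * X_{t-1} + Z_t of length n."""
    sd = 1.0 / math.sqrt(1.0 - rho * rho)
    x = rng.gauss(0.0, sd)  # start in the stationary marginal law
    path = []
    for _ in range(n):
        path.append(x)
        x = rho * x + rng.gauss(0.0, 1.0)
    return path


def kolmogorov_distance_to_cdf(sample, cdf):
    """sup_y |F_n(y) - F(y)| for the empirical d.f. F_n and a continuous d.f. F."""
    xs = sorted(sample)
    n = len(xs)
    return max(max((i + 1) / n - cdf(x), cdf(x) - i / n) for i, x in enumerate(xs))


def exceedance_frequency(n, rho, delta, runs, seed=0):
    """Monte Carlo estimate of P[d_(1)(m_n, P_1) >= delta] for the AR(1) law."""
    rng = random.Random(seed)
    sd = 1.0 / math.sqrt(1.0 - rho * rho)
    hits = 0
    for _ in range(runs):
        path = simulate_ar1(n, rho, rng)
        dist = kolmogorov_distance_to_cdf(path, lambda y: normal_cdf(y, sd))
        hits += dist >= delta
    return hits / runs
```

Calling `exceedance_frequency` for increasing `n` with fixed `rho` and `delta` gives a rough picture of how quickly the exceedance probability decays for this particular law.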

Remark 3.2 (i) Condition (b) is always fulfilled if $\phi$ is bounded, in particular if $d_{(\phi)}$ is
the classical Kolmogorov metric ($\phi\equiv 1$).

(ii) Condition (b) is also fulfilled if one can find some function $w:\mathbb{R}\to[0,\infty)$ such
that $\lim_{|x|\to\infty} w(x)/\phi(x) = \infty$ and $\sup_{\mathbb{P}\in\mathcal{P}}\int w(y)\,\mathbb{P}_1(dy)<\infty$; cf. [22, Lemma 4.1]. ◇

Remark 3.3 Since the Lévy metric $d_{\mathrm{Lévy}}$ (defined in (6) below) generates the weak
topology (cf. [19, p. 25]) and the Kolmogorov metric $d_{(1)}$ dominates the Lévy metric, i.e.
$d_{\mathrm{Lévy}}\le d_{(1)}$ (cf. [19, p. 34]), we obtain that every functional $T$ which is weakly continuous
at some $\mu\in\mathcal{M}_1(\mathbb{R})$ is also continuous w.r.t. $d_{(1)}$ at $\mu$. Further notice that qualitative
robustness w.r.t. $d_{\mathrm{Lévy}}$ implies qualitative robustness w.r.t. $d_{(1)}$. On the other hand, the
UGC property w.r.t. $d_{(1)}$ implies the UGC property w.r.t. $d_{\mathrm{Lévy}}$. ◇

Example 3.4 (i) It is well-known that for fixed $\alpha\in(0,1)$ the lower $\alpha$-quantile functional
$T_\alpha(\mu) := F_\mu^{\leftarrow}(\alpha)$ is continuous w.r.t. the weak topology at $\mu\in\mathcal{M}_1(\mathbb{R})$ when $F_\mu^{\leftarrow}$
is continuous at $\alpha$; see, for instance, [33, Lemma 21.2]. According to the first part of
Remark 3.3, $T_\alpha$ is in particular continuous w.r.t. the Kolmogorov metric $d_{(1)}$ at this $\mu$.

(ii) The mean functional $T^{(1)}(\mu) := \int y\,\mu(dy)$ is continuous on $\mathcal{M}_1^{(\phi)}(\mathbb{R})$ w.r.t. $d_{(\phi)}$ for
any $\phi$ satisfying $\int 1/\phi(y)\,dy<\infty$. This follows from the inequality $|T^{(1)}(\mu)-T^{(1)}(\nu)|\le\int|F_\mu(y)-F_\nu(y)|\,dy$
which holds for all $\mu,\nu\in\mathcal{M}_1^{(\phi)}(\mathbb{R})$. For instance, we may choose
$\phi(y) = (1+|y|)^{1+\varepsilon}$ for arbitrarily small $\varepsilon>0$. In this case, condition (b) in Theorem
3.1 holds when $\sup_{\mathbb{P}\in\mathcal{P}}\int|y|^{1+\varepsilon'}\,\mathbb{P}_1(dy)<\infty$ for some $\varepsilon'>\varepsilon$; cf. Remark 3.2 (ii).

(iii) The second moment functional $T^{(2)}(\mu) := \int y^2\,\mu(dy)$, and thus the variance
functional, is continuous on $\mathcal{M}_1^{(\phi)}(\mathbb{R})$ w.r.t. $d_{(\phi)}$ for any $\phi$ satisfying $\int|y|/\phi(y)\,dy<\infty$. This
follows from the inequality $|T^{(2)}(\mu)-T^{(2)}(\nu)|\le 2\int|F_\mu(y)-F_\nu(y)|\,|y|\,dy$ which holds
for all $\mu,\nu\in\mathcal{M}_1^{(\phi)}(\mathbb{R})$. For instance, we may choose $\phi(y) = (1+|y|)^{2+\varepsilon}$ for arbitrarily small
$\varepsilon>0$. In this case, condition (b) in Theorem 3.1 holds when $\sup_{\mathbb{P}\in\mathcal{P}}\int|y|^{2+\varepsilon'}\,\mathbb{P}_1(dy)<\infty$
for some $\varepsilon'>\varepsilon$; cf. Remark 3.2 (ii). ◇
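The inequality used in part (ii) of Example 3.4 can be checked numerically for empirical measures, for which the integral of $|F_\mu-F_\nu|$ is a finite sum. The Python sketch below does this; the function name `l1_distance_of_cdfs` is hypothetical. Continuity of the mean w.r.t. $d_{(\phi)}$ then follows because $|F_\mu(y)-F_\nu(y)|\le d_{(\phi)}(\mu,\nu)/\phi(y)$ for every $y$, so the integral is at most $d_{(\phi)}(\mu,\nu)\int 1/\phi(y)\,dy$.

```python
from bisect import bisect_right


def l1_distance_of_cdfs(xs, ys):
    """Integral of |F_mu - F_nu| for the empirical measures mu, nu of xs, ys."""
    sx, sy = sorted(xs), sorted(ys)
    pooled = sorted(set(sx) | set(sy))
    total = 0.0
    # Between consecutive pooled atoms both distribution functions are
    # constant, so the integral is a finite sum of rectangle areas.
    for left, right in zip(pooled, pooled[1:]):
        diff = abs(bisect_right(sx, left) / len(sx) - bisect_right(sy, left) / len(sy))
        total += diff * (right - left)
    return total


def sample_mean(xs):
    """Mean functional evaluated at the empirical measure of xs."""
    return sum(xs) / len(xs)
```

For any two samples `xs` and `ys`, the quantity `abs(sample_mean(xs) - sample_mean(ys))` should not exceed `l1_distance_of_cdfs(xs, ys)` up to floating-point rounding.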

Example 3.5 (Linear processes) Let $(Z_s)_{s\in\mathbb{Z}}$ be a sequence of i.i.d. random variables
on any probability space, assume that $|Z_1|$ has a finite expectation denoted by $L$, and
assume that the distribution of $Z_1$ admits a Lebesgue density $f$ for which
$\int|f(y+h)-f(y)|\,dy\le M|h|$ for all $h\in\mathbb{R}$ and some constant $M>0$. Define the linear process
$X_t^a := \sum_{s=0}^\infty a_sZ_{t-s}$, $t\in\mathbb{N}$, for any real sequence $a=(a_s)_{s\in\mathbb{N}_0}$, and let $A$ be the class
of all real sequences $a$ for which $a_0=1$ and $\sum_{s=0}^\infty a_sZ_{t-s}$ is almost surely absolutely
convergent for every $t\in\mathbb{N}$. Results in [26] imply that, if $a\in A$ satisfies $\sum_{s=0}^\infty a_sz^s\ne 0$
for all $z$ with $|z|\le 1$, and $\sum_{u=1}^\infty\sum_{s=u}^\infty|a_s|<\infty$, then $(X_t^a)$ is strongly mixing with mixing
coefficients $(\alpha_a(n))$ satisfying $\alpha_a(n)\le\big(2ML\sum_{s=0}^\infty|b_s(a)|\big)\sum_{u=n}^\infty\sum_{s=u}^\infty|a_s|$, where
$b_s(a)$ is the coefficient of $z^s$ in the power series expansion of $z\mapsto 1/\sum_{s=0}^\infty a_sz^s$, and
$\sum_{s=0}^\infty|b_s(a)|<\infty$; cf. Appendix B. If we denote by $\mathbb{P}^a$ the law of $(X_t^a)$ on $\mathbb{R}^{\mathbb{N}}$,
then, of course, the coordinate process $(X_t)$ on $\mathbb{R}^{\mathbb{N}}$ is also strongly mixing under $\mathbb{P}^a$ with
mixing coefficients $(\alpha_{\mathbb{P}^a}(n))$ satisfying $\alpha_{\mathbb{P}^a}(n)\le\big(2ML\sum_{s=0}^\infty|b_s(a)|\big)\sum_{u=n}^\infty\sum_{s=u}^\infty|a_s|$.
Now, let $A'$ be any subset of $A$ such that

(i) $\sum_{s=0}^\infty a_sz^s\ne 0$ for all $z$ with $|z|\le 1$, for every $a\in A'$,
(ii) $\lim_{n\to\infty}\sup_{a\in A'}\sum_{u=n}^\infty\sum_{s=u}^\infty|a_s| = 0$,
(iii) $\sup_{a\in A'}\sum_{s=0}^\infty|b_s(a)|<\infty$.

Then we obtain from the statement above that the class $\mathcal{P}=\mathcal{P}' := \{\mathbb{P}^a : a\in A'\}$
satisfies condition (c) in Theorem 3.1. Moreover, condition (a) in Theorem 3.1 is fulfilled
for $\mathcal{P}=\mathcal{P}'$ anyway, and condition (b) in Theorem 3.1 is always fulfilled when $\phi\equiv 1$.
Therefore, Theorem 3.1 shows that $\mathcal{P}'$ admits the UGC property for the Kolmogorov
metric $d_{(1)}$. In particular, the Hampel-type Theorem 2.4 implies that any $d_{(1)}$-continuous
functional $T$ is qualitatively $\mathcal{P}'$-marginally robust at any $\mathbb{P}^a\in\mathcal{P}'$ w.r.t. $(d_{(1)},d'_{\mathrm{Proh}})$.
That is, for every $\varepsilon>0$ there are some $\delta>0$ and $n_0\in\mathbb{N}$ such that

$$\mathbb{P}^{a'}\in\mathcal{P}',\quad d_{(1)}(\mathbb{P}_1^a,\mathbb{P}_1^{a'})\le\delta \quad\Longrightarrow\quad d'_{\mathrm{Proh}}\big(\mathbb{P}^a\circ\hat T_n^{-1},\,\mathbb{P}^{a'}\circ\hat T_n^{-1}\big)\le\varepsilon \quad \forall\, n\ge n_0.$$

To get a feeling for the condition on the left, it is appealing to find preferably sharp
conditions on $a'=(a'_s)_{s\in\mathbb{N}_0}$ and the distribution of $Z_0$ under which the distance $d_{(1)}(\mathbb{P}_1^a,\mathbb{P}_1^{a'})$
does not exceed a given $\delta>0$. This is an interesting problem on its own, and we will not
enlarge upon this here. We will only mention that, if $\sum_{s=0}^\infty d_{(1)}(\mu_{a_s},\mu_{a'_s})<\infty$ with $\mu_{a_s}$
and $\mu_{a'_s}$ the laws of $a_sZ_0$ and $a'_sZ_0$, respectively, then, using the convolution formula, one
easily obtains the estimate $d_{(1)}(\mathbb{P}_1^a,\mathbb{P}_1^{a'})\le\sum_{s=0}^\infty d_{(1)}(\mu_{a_s},\mu_{a'_s})$, from which one can derive
some respective conditions. However, there might be more sophisticated approaches. ◇



Example 3.6 (ARMA(1,1) processes) To illustrate conditions (i)-(iii) in Example 3.5,
let us consider for any real $\phi_1,\theta_1$ an ARMA(1,1) process $X_t^{\phi_1,\theta_1} = \phi_1X_{t-1}^{\phi_1,\theta_1}+Z_t+\theta_1Z_{t-1}$
based on a given sequence $(Z_s)_{s\in\mathbb{Z}}$ of square-integrable and centered i.i.d. random
variables. Moreover, let $0<c<1$ be arbitrary but fixed. It is discussed in detail in Example
B.4 in Appendix B that if $|\phi_1|,|\theta_1|\le c$ and $\phi_1\ne-\theta_1$, then the ARMA process
$(X_t^{\phi_1,\theta_1})$ can be represented as a linear process $\sum_{s=0}^\infty a_s(\phi_1,\theta_1)Z_{t-s}$ with $a_0(\phi_1,\theta_1)=1$.
Letting $A'_c := \{(a_s(\phi_1,\theta_1))_{s\in\mathbb{N}_0} : |\phi_1|,|\theta_1|\le c \text{ and } \phi_1\ne-\theta_1\}$, Example B.4 also shows
that

(i) $\sum_{s=0}^\infty a_sz^s\ne 0$ for all $z$ with $|z|\le 1$, for every $a\in A'_c$,

(ii) $\sup_{a\in A'_c}\sum_{u=n}^\infty\sum_{s=u}^\infty|a_s|\le\big(2/(1-c)^2\big)\,c^n$ for all $n\in\mathbb{N}$,

(iii) $\sup_{a\in A'_c}\sum_{s=0}^\infty|b_s(a)|\le 1+2c/(1-c)$.

Thus, denoting by $\mathbb{P}^{\phi_1,\theta_1}$ the law of the ARMA(1,1) process $(X_t^{\phi_1,\theta_1})$ based on the given
noise $(Z_s)$ and with coefficients $\phi_1,\theta_1$, the discussion in Example 3.5 shows that the class
$\mathcal{P}=\mathcal{P}_c := \{\mathbb{P}^{\phi_1,\theta_1} : |\phi_1|,|\theta_1|\le c \text{ and } \phi_1\ne-\theta_1\}$ satisfies condition (c) in Theorem 3.1
because $\mathcal{P}_c$ is nothing but the class of laws $\mathbb{P}^a$ on $\mathbb{R}^{\mathbb{N}}$ of all linear processes $\sum_{s=0}^\infty a_sZ_{t-s}$
with $a\in A'_c$. ◇
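The bounds (ii) and (iii) of Example 3.6 can be checked numerically for concrete parameter values. The Python sketch below assumes the standard linear representation of an ARMA(1,1) process, namely $a_s = \phi_1^{s-1}(\phi_1+\theta_1)$ and $b_s = -(-\theta_1)^{s-1}(\phi_1+\theta_1)$ for $s\ge 1$, obtained by expanding $(1+\theta_1z)/(1-\phi_1z)$ and its reciprocal; this is an assumption of the sketch, since the derivation in Appendix B is not reproduced here. The infinite sums are truncated, and all function names are hypothetical.

```python
def arma11_linear_coefficients(phi1, theta1, n_terms):
    """Coefficients a_0, ..., a_{n_terms-1} of (1 + theta1 z) / (1 - phi1 z)."""
    a = [1.0]
    for s in range(1, n_terms):
        a.append(phi1 ** (s - 1) * (phi1 + theta1))
    return a


def arma11_inverse_coefficients(phi1, theta1, n_terms):
    """Coefficients b_0, ..., b_{n_terms-1} of (1 - phi1 z) / (1 + theta1 z)."""
    b = [1.0]
    for s in range(1, n_terms):
        b.append(-((-theta1) ** (s - 1)) * (phi1 + theta1))
    return b


def double_tail_sum(a, n):
    """Truncated version of sum_{u >= n} sum_{s >= u} |a_s|.

    Each |a_s| with s >= n appears in exactly (s - n + 1) of the inner sums.
    """
    return sum((s - n + 1) * abs(a[s]) for s in range(n, len(a)))
```

For example, with `c = 0.5`, `phi1 = 0.3` and `theta1 = 0.4`, one can compare `double_tail_sum(a, n)` against `2 / (1 - c) ** 2 * c ** n` and `sum(abs(x) for x in b)` against `1 + 2 * c / (1 - c)`.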

Recall from [19, p. 25] that the Lévy metric

$$d_{\mathrm{Lévy}}(\mu,\nu) := \inf\{\varepsilon>0 : F_\mu(x-\varepsilon)-\varepsilon\le F_\nu(x)\le F_\mu(x+\varepsilon)+\varepsilon \text{ for all } x\in\mathbb{R}\} \tag{6}$$

(with $F_\mu,F_\nu$ the distribution functions of $\mu,\nu\in\mathcal{M}_1(\mathbb{R})$) generates the weak topology
on $\mathcal{M}_1(\mathbb{R})$. Since the classical Kolmogorov metric $d_{(1)}$ dominates the Lévy metric (cf.
[19, p. 34]), we immediately obtain the following corollary to Theorem 3.1 with $\phi :\equiv 1$,
taking Remark 3.2 (i) into account.

Corollary 3.7 (UGC property w.r.t. $d_{\mathrm{Lévy}}$) Let $\mathcal{P}$ be a class of probability measures on
$(\Omega,\mathcal{F})$ satisfying conditions (a) and (c) of Theorem 3.1. Then we have (5) with $d_{(\phi)}$
replaced by $d_{\mathrm{Lévy}}$.

Obviously, Example 3.5 remains valid (except the estimate of $d_{(1)}(\mathbb{P}_1^a,\mathbb{P}_1^{a'})$ at the end)
when replacing the Kolmogorov metric $d_{(1)}$ by the Lévy metric $d_{\mathrm{Lévy}}$.



3.2. Strong mixing, and $\psi$-weak topology

Let $E$ be a locally compact and second countable Hausdorff space (in this case $E$ is in
particular a Polish space), $\mathcal{E}$ be the corresponding Borel $\sigma$-field, and $\psi:E\to[0,\infty)$ be
a continuous function satisfying $\psi\ge 1$ outside some compact set. Let $\mathcal{M}_1^\psi(E)$ be the set
of all probability measures $\mu$ on $(E,\mathcal{E})$ satisfying $\int\psi\,d\mu<\infty$, and $C_\psi(E)$ be the space of
all continuous functions $f$ on $E$ for which $\|f/(1+\psi)\|_\infty<\infty$, where $\|g\|_\infty := \sup_{y\in E}|g(y)|$
for any function $g:E\to\mathbb{R}$. The $\psi$-weak topology on $\mathcal{M}_1^\psi(E)$ is the coarsest topology
for which the mappings $\mu\mapsto\int f\,d\mu$, $f\in C_\psi(E)$, are continuous; cf. Section A.6 in [13].
Clearly, the $\psi$-weak topology is finer than the weak topology, and the two topologies
coincide if and only if $\psi$ is bounded. It follows from Lemma 3.4 (i)$\Leftrightarrow$(iii) in [21] (which
still holds when replacing $\mathbb{R}$ by some Polish space) that the metric

$$d_{\psi,\mathrm{vag}}(\mu,\nu) := d_{\mathrm{vag}}(\mu,\nu)+\Big|\int\psi\,d\mu-\int\psi\,d\nu\Big|,\qquad \mu,\nu\in\mathcal{M}_1^\psi(E), \tag{7}$$

metrizes the $\psi$-weak topology when $d_{\mathrm{vag}}$ metrizes the vague topology on $\mathcal{M}_1(E)$. For
$d_{\mathrm{vag}}$ we may and do choose

$$d_{\mathrm{vag}}(\mu,\nu) := \sum_{k=1}^\infty 2^{-k}\Big(1\wedge\Big|\int f_k\,d\mu-\int f_k\,d\nu\Big|\Big),\qquad \mu,\nu\in\mathcal{M}_1(E), \tag{8}$$

for some countable and $\|\cdot\|_\infty$-dense subset $\{f_k\}_{k\in\mathbb{N}}$ of the space $C_c(E)$ of all continuous
functions on $E$ with compact support; cf. the proof of Theorem 31.5 in [1].

Remark 3.8 Any locally compact and second countable Hausdorff space $E$ is $\sigma$-compact
(cf. Example 2 in Section 29 of [1]), i.e. there exists a sequence $(K_n)$ of compact subsets
of $E$ such that $K_n\uparrow E$ and every compact set $K$ is contained in all but finitely many $K_n$. So by
Urysohn's lemma one can find functions $e_n\in C_c(E)$, $n\in\mathbb{N}$, such that $0\le e_n\le 1$ and
$e_n=1$ on $K_n$ for all $n\in\mathbb{N}$. If $\{f'_l\}_{l\in\mathbb{N}}$ denotes any countable and $\|\cdot\|_\infty$-dense subset of
$C_c(E)$, then the set $\{f_k\}_{k\in\mathbb{N}}$ in (8) can be chosen as

$$\{f_k\}_{k\in\mathbb{N}} := \{f'_l\}_{l\in\mathbb{N}}\cup\{f'_l\,e_n\}_{l,n\in\mathbb{N}}\cup\{e_n\}_{n\in\mathbb{N}}.$$

This becomes clear from the elaborations in the proof of Theorem 31.5 in [1]. ◇

The covariance functional $T(\mu) := \int_{\mathbb{R}^2}\big(x_1-\int_{\mathbb{R}}x\,\mu_1(dx)\big)\big(x_2-\int_{\mathbb{R}}x\,\mu_2(dx)\big)\,\mu(d(x_1,x_2))$
($\mu_1$ and $\mu_2$ denote the marginal distributions of $\mu$), for instance, is clearly $\psi$-weakly
continuous for $\psi(x) := |x|^2$ and $E=\mathbb{R}^2$. So, in view of the Hampel-type Theorem 2.4, for a
given class $\mathcal{P}$ of probability measures on $(\Omega,\mathcal{F})$, the corresponding sequence $(\hat T_n)$ of plug-in
estimators, i.e. the sequence of the sample covariances, is qualitatively $\mathcal{P}$-marginally
robust w.r.t. $(d_{\psi,\mathrm{vag}},d'_{\mathrm{Proh}})$ at any $\mathbb{P}\in\mathcal{P}$ if $\mathcal{P}$ admits the UGC property w.r.t. $d_{\psi,\mathrm{vag}}$
in the sense of Definition 2.3. The following Theorem 3.9 shows that the latter is true
under similar conditions as imposed in Theorem 3.1.

Theorem 3.9 (UGC property w.r.t. $d_{\psi,\mathrm{vag}}$) Let $E$ be a locally compact and second
countable Hausdorff space, $\mathcal{E}$ be the corresponding Borel $\sigma$-field, and $\psi:E\to[1,\infty)$
be a continuous function. Let $\mathcal{P}$ be a class of probability measures on $(\Omega,\mathcal{F})$ such that

(a) $\mathbb{P}_1=\mathbb{P}_2=\cdots$ for all $\mathbb{P}\in\mathcal{P}$,

(b) $\lim_{K\to\infty}\sup_{\mathbb{P}\in\mathcal{P}}\int\psi(y)\,\mathbb{1}_{\{\psi(y)\ge K\}}\,\mathbb{P}_1(dy) = 0$,

(c) $\lim_{n\to\infty}\sup_{\mathbb{P}\in\mathcal{P}}\alpha_{\mathbb{P}}(n) = 0$.

Then, for every $\delta>0$,

$$\lim_{n\to\infty}\,\sup_{\mathbb{P}\in\mathcal{P}}\,\mathbb{P}\big[d_{\psi,\mathrm{vag}}(\hat m_n,\mathbb{P}_1)\ge\delta\big] = 0. \tag{9}$$

The proof of Theorem 3.9 can be found in Section 6. Notice that (c) implies in
particular that the coordinate process $(X_i)$ is strongly mixing under every $\mathbb{P}\in\mathcal{P}$.

Remark 3.10 (i) Notice that condition (b) is always fulfilled if $\psi$ is bounded, i.e. if
$d_{\psi,\mathrm{vag}}$ metrizes the classical weak topology.

(ii) In the case $E=\mathbb{R}$, condition (b) is fulfilled if there is some $w:\mathbb{R}\to[0,\infty)$ such
that $\lim_{|x|\to\infty}w(x)/\psi(x) = \infty$ and $\sup_{\mathbb{P}\in\mathcal{P}}\int w(y)\,\mathbb{P}_1(dy)<\infty$; cf. [22, Lemma 4.1]. ◇

Recall from (6) the definition of the Lévy metric $d_{\mathrm{Lévy}}$. Since $d_{\mathrm{Lévy}}$ generates the weak
topology on $\mathcal{M}_1(\mathbb{R})$, the metric

$$d_{\psi,\mathrm{Lévy}}(\mu,\nu) := d_{\mathrm{Lévy}}(\mu,\nu)+\Big|\int\psi\,d\mu-\int\psi\,d\nu\Big|,\qquad \mu,\nu\in\mathcal{M}_1^\psi(\mathbb{R}), \tag{10}$$

generates the $\psi$-weak topology on $\mathcal{M}_1^\psi(\mathbb{R})$. The following corollary is an immediate
consequence of Theorem 3.9 (in fact of (25) in its proof) and Corollary 3.7.

Corollary 3.11 (UGC property w.r.t. $d_{\psi,\mathrm{Lévy}}$) Let $E=\mathbb{R}$, and $\mathcal{P}$ be a class of probability
measures on $(\Omega,\mathcal{F})$ satisfying conditions (a)-(c) of Theorem 3.9. Then we have
(9) with $d_{\psi,\mathrm{vag}}$ replaced by $d_{\psi,\mathrm{Lévy}}$.

Example 3.12 (i) It was already mentioned in Example 3.4 (i) that for fixed $\alpha\in(0,1)$
the lower $\alpha$-quantile functional is continuous at $\mu\in\mathcal{M}_1(\mathbb{R})$ w.r.t. the classical weak
topology, i.e. w.r.t. the $\psi$-weak topology with $\psi\equiv 1$, provided $F_\mu^{\leftarrow}$ is continuous at $\alpha$.

(ii) It is easily seen that the mean functional and the variance functional are continuous
w.r.t. the $\psi$-weak topology on $\mathcal{M}_1^\psi(\mathbb{R})$ for $\psi(x)=|x|$ and $\psi(x)=|x|^2$, respectively.

(iii) It is demonstrated in [22] that the statistical functional $T(\mu)=\rho(X_\mu)$ corresponding
to any law-invariant convex risk measure $\rho$ defined on an Orlicz space with continuous
Young function $\Psi$ satisfying the $\Delta_2$-condition (meaning that there are $C,x_0>0$ such
that $\Psi(2x)\le C\,\Psi(x)$ for all $x\ge x_0$) is continuous w.r.t. the $\psi$-weak topology on $\mathcal{M}_1^\psi(\mathbb{R})$
for $\psi(\cdot)=\Psi(|\cdot|)$. ◇

4. Proof of Theorem 2.4

We adapt the proof of the classical Hampel theorem as given in [19, 20]. For the reader's
convenience, we first of all recall Strassen's theorem (as formulated in Theorem 2.4.7 in
[19]; the proof is contained in the seminal paper [32]) whose implication (ii)$\Rightarrow$(i) is the
key for the proof of Theorem 2.4.



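Theorem 4.1 (Strassen) Let $\mathbf{T}$ be a Polish space equipped with the corresponding Borel
$\sigma$-field $\mathcal{T}$, and $d_{\mathbf{T}}$ be any complete and separable metric generating the topology on $\mathbf{T}$.
Then, for any two probability measures $\mu_1,\mu_2$ on $(\mathbf{T},\mathcal{T})$ and any $\varepsilon,\delta>0$, the following
two statements are equivalent:

(i) For every $A\in\mathcal{T}$ we have

$$\mu_1[A]\le\mu_2[A^\delta]+\varepsilon,$$

where $A^\delta := \{t\in\mathbf{T} : \inf_{a\in A}d_{\mathbf{T}}(t,a)\le\delta\}$.

(ii) There is some probability measure $\mu$ on $(\mathbf{T}\times\mathbf{T},\mathcal{T}\otimes\mathcal{T})$ such that $\mu\circ\pi_1^{-1}=\mu_1$,
$\mu\circ\pi_2^{-1}=\mu_2$, and

$$\mu\big[\{(t_1,t_2)\in\mathbf{T}\times\mathbf{T} : d_{\mathbf{T}}(t_1,t_2)\le\delta\}\big]\ge 1-\varepsilon,$$

where $\pi_i:\mathbf{T}\times\mathbf{T}\to\mathbf{T}$ denotes the projection on the $i$th coordinate, $i=1,2$.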

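To prove Theorem 2.4, we have to show that for every $\varepsilon>0$ there are some $\delta>0$ and
$n_0\in\mathbb{N}$ such that

$$\mathbb{Q}\in\mathcal{P},\quad d(\mathbb{P}_1,\mathbb{Q}_1)\le\delta \quad\Longrightarrow\quad d'_{\mathrm{Proh}}\big(\mathbb{P}\circ\hat T_n^{-1},\,\mathbb{Q}\circ\hat T_n^{-1}\big)\le\varepsilon \quad \forall\, n\ge n_0. \tag{11}$$

Since

$$d'_{\mathrm{Proh}}\big(\mathbb{P}\circ\hat T_n^{-1},\,\mathbb{Q}\circ\hat T_n^{-1}\big)\le d'_{\mathrm{Proh}}\big(\mathbb{P}\circ\hat T_n^{-1},\,\delta_{T(\mathbb{P}_1)}\big)+d'_{\mathrm{Proh}}\big(\delta_{T(\mathbb{P}_1)},\,\mathbb{Q}\circ\hat T_n^{-1}\big)$$

(with $\delta_{T(\mathbb{P}_1)}$ the Dirac measure on $(\mathbf{T},\mathcal{T})$ with atom $T(\mathbb{P}_1)$), for (11) it suffices to show
that for every $\varepsilon>0$ there are some $\delta>0$ and $n_0\in\mathbb{N}$ such that

$$\mathbb{Q}\in\mathcal{P},\quad d(\mathbb{P}_1,\mathbb{Q}_1)\le\delta \quad\Longrightarrow\quad d'_{\mathrm{Proh}}\big(\delta_{T(\mathbb{P}_1)},\,\mathbb{Q}\circ\hat T_n^{-1}\big)\le\varepsilon/2 \quad \forall\, n\ge n_0. \tag{12}$$

Indeed, the choice $\mathbb{Q}=\mathbb{P}$ is admissible in (12), so that (12) controls both summands on the
right-hand side of the preceding inequality.

The remainder of the proof is divided into two steps. In Step 1, we will verify that for
(12) it suffices to show that for every $\varepsilon>0$ there are some $\delta>0$ and $n_0\in\mathbb{N}$ such that

$$\mathbb{Q}\in\mathcal{P},\quad d(\mathbb{P}_1,\mathbb{Q}_1)\le\delta \quad\Longrightarrow\quad \mathbb{Q}\big[\{x\in\Omega : d_{\mathbf{T}}(T(\mathbb{P}_1),\hat T_n(x))\le\tfrac{\varepsilon}{2}\}\big]\ge 1-\tfrac{\varepsilon}{2} \quad \forall\, n\ge n_0. \tag{13}$$

In Step 2, we will verify (13).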
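
Step 1. We note that the right-hand side in (13) is equivalent to

$$\big(\delta_{T(\mathbb{P}_1)}\otimes(\mathbb{Q}\circ\hat T_n^{-1})\big)\big[\{(t_1,t_2)\in\mathbf{T}\times\mathbf{T} : d_{\mathbf{T}}(t_1,t_2)\le\tfrac{\varepsilon}{2}\}\big]\ge 1-\tfrac{\varepsilon}{2} \quad \forall\, n\ge n_0. \tag{14}$$

In view of the implication (ii)$\Rightarrow$(i) in Strassen's Theorem 4.1 (with $\mu := \delta_{T(\mathbb{P}_1)}\otimes(\mathbb{Q}\circ\hat T_n^{-1})$
and $\varepsilon := \delta := \varepsilon/2$), condition (14) implies

$$\delta_{T(\mathbb{P}_1)}[A]\le\mathbb{Q}\circ\hat T_n^{-1}[A^{\varepsilon/2}]+\varepsilon/2 \quad \forall\, A\in\mathcal{T},\ n\ge n_0,$$

i.e.

$$d'_{\mathrm{Proh}}\big(\delta_{T(\mathbb{P}_1)},\,\mathbb{Q}\circ\hat T_n^{-1}\big)\le\varepsilon/2 \quad \forall\, n\ge n_0.$$

That is, the right-hand side in (13) implies the right-hand side in (12).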

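Step 2. To verify (13), we pick $\varepsilon>0$. Since $T$ is $(d,d_{\mathbf{T}})$-continuous at $\mathbb{P}_1$, we can find
some $\delta>0$ such that for every $x\in\Omega$ and $n\in\mathbb{N}$

$$d(\mathbb{P}_1,\hat m_n(x))\le 2\delta \quad\Longrightarrow\quad d_{\mathbf{T}}\big(T(\mathbb{P}_1),T(\hat m_n(x))\big)\le\varepsilon/2; \tag{15}$$

recall that the class of all empirical probability measures $\hat m_n(x)$ was assumed to be
contained in $\mathcal{M}$. Now, for any $\mathbb{Q}\in\mathcal{P}$ satisfying $d(\mathbb{P}_1,\mathbb{Q}_1)\le\delta$, we obtain with the help
of the UGC property of $\mathcal{P}$ w.r.t. $d$, of the implication

$$d(\mathbb{Q}_1,\hat m_n(x))\le\delta \quad\Longrightarrow\quad d(\mathbb{P}_1,\hat m_n(x))\ \big[\le d(\mathbb{P}_1,\mathbb{Q}_1)+d(\mathbb{Q}_1,\hat m_n(x))\big]\ \le d(\mathbb{P}_1,\mathbb{Q}_1)+\delta,$$

and of (15) that

$$
\begin{aligned}
1-\frac{\varepsilon}{2}
&\le \mathbb{Q}\big[\{x\in\Omega : d(\mathbb{Q}_1,\hat m_n(x))\le\delta\}\big] \\
&\le \mathbb{Q}\big[\{x\in\Omega : d(\mathbb{P}_1,\hat m_n(x))\le d(\mathbb{P}_1,\mathbb{Q}_1)+\delta\}\big] \\
&\le \mathbb{Q}\big[\{x\in\Omega : d(\mathbb{P}_1,\hat m_n(x))\le 2\delta\}\big] \\
&\le \mathbb{Q}\big[\{x\in\Omega : d_{\mathbf{T}}(T(\mathbb{P}_1),T(\hat m_n(x)))\le\varepsilon/2\}\big] \\
&= \mathbb{Q}\big[\{x\in\Omega : d_{\mathbf{T}}(T(\mathbb{P}_1),\hat T_n(x))\le\varepsilon/2\}\big]
\end{aligned}
$$

for all $n\ge n_0$ and some $n_0=n_0(\varepsilon)\in\mathbb{N}$, where $n_0$ can be chosen independently of $\mathbb{Q}$ by
the UGC property of $\mathcal{P}$. Thus, (13) holds. This completes the proof of Theorem 2.4.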



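5. Proof of Theorem 3.1

We will only show that for every $\delta > 0$

\[ \lim_{n\to\infty} \sup_{\mathbb{P}\in\mathcal{P}} \mathbb{P}\Big[ \sup_{y \in (-\infty, 0]} |\hat F_n(y) - F_{\mathbb{P}}(y)|\, \phi(y) \ge \delta \Big] = 0, \tag{16} \]

where $\hat F_n$ and $F_{\mathbb{P}}$ denote the empirical distribution function of $X_1, \dots, X_n$ and the distribution function of the marginal distribution $\mathbb{P}_1$, respectively. The analogous result for the positive real line can be shown in the same way. We will proceed in five steps. For every $\mathbb{P} \in \mathcal{P}$, let the functions $w_{\mathbb{P}} : [0, 1] \to [0, \infty]$ and $h_{\mathbb{P}} : [0, 1] \to [0, \infty)$ be defined by

\[ w_{\mathbb{P}}(t) := \phi\big(F_{\mathbb{P}}^{\leftarrow}(t)\big)\, \mathbf{1}_{[0, F_{\mathbb{P}}(0)]}(t), \qquad h_{\mathbb{P}}(t) := \int_0^t w_{\mathbb{P}}(s)\, ds. \]

Of course, the functions $h_{\mathbb{P}}$ are continuous, nondecreasing and satisfy $h_{\mathbb{P}}(0) = 0$ for all $\mathbb{P} \in \mathcal{P}$. By assumption (b), we also have $\sup_{\mathbb{P}\in\mathcal{P}} h_{\mathbb{P}}(1)$ $(\le \sup_{\mathbb{P}\in\mathcal{P}} \int \phi\, d\mathbb{P}_1) < \infty$.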

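Step 1. As the first step, we will show that the family of functions $\{h_{\mathbb{P}}\}_{\mathbb{P}\in\mathcal{P}}$ is uniformly equicontinuous on $[0, 1]$. Denote by $J_{\mathbb{P}}$ the set of all $y \in (-\infty, 0]$ at which $F_{\mathbb{P}}$ has a jump, and by $I_{\mathbb{P}} \subset [0, 1]$ the range of $F_{\mathbb{P}}$. Noting that $F_{\mathbb{P}}(F_{\mathbb{P}}^{\leftarrow}(s)) = s$ if and only if $s \in I_{\mathbb{P}}$ and that $F_{\mathbb{P}}^{\leftarrow}(s) \in J_{\mathbb{P}}$ for all $s \in [0, 1] \setminus I_{\mathbb{P}}$ (cf. [30, p. 113]), we obtain for any $K > 0$

\[
\begin{aligned}
\sup_{\mathbb{P}\in\mathcal{P}} h_{\mathbb{P}}(t)
&\le \sup_{\mathbb{P}\in\mathcal{P}} \int_0^1 \phi(F_{\mathbb{P}}^{\leftarrow}(s))\, \mathbf{1}_{[0, F_{\mathbb{P}}(0)\wedge t] \cap I_{\mathbb{P}}}(s)\, ds
 + \sup_{\mathbb{P}\in\mathcal{P}} \int_0^1 \phi(F_{\mathbb{P}}^{\leftarrow}(s))\, \mathbf{1}_{[0, F_{\mathbb{P}}(0)\wedge t] \setminus I_{\mathbb{P}}}(s)\, ds \\
&\le \sup_{\mathbb{P}\in\mathcal{P}} \int_0^1 \phi(F_{\mathbb{P}}^{\leftarrow}(s))\, \mathbf{1}_{[0, F_{\mathbb{P}}(0)\wedge t] \cap I_{\mathbb{P}}}\big(F_{\mathbb{P}}(F_{\mathbb{P}}^{\leftarrow}(s))\big)\, ds
 + \sup_{\mathbb{P}\in\mathcal{P}} \sum_{y \in J_{\mathbb{P}}} \phi(y) \big( (F_{\mathbb{P}}(y) \wedge t) - (F_{\mathbb{P}}(y-) \wedge t) \big) \\
&\le \sup_{\mathbb{P}\in\mathcal{P}} \int \phi(y)\, \mathbf{1}_{\phi(y) > K}\, \mathbb{P}_1(dy)
 + \sup_{\mathbb{P}\in\mathcal{P}} K \int \mathbf{1}_{[0, t]}(F_{\mathbb{P}}(y))\, \mathbb{P}_1(dy) \\
&\qquad + \sup_{\mathbb{P}\in\mathcal{P}} \int \phi(y)\, \mathbf{1}_{\phi(y) > K}\, \mathbb{P}_1(dy)
 + \sup_{\mathbb{P}\in\mathcal{P}} K \sum_{y \in J_{\mathbb{P}}} \big( (F_{\mathbb{P}}(y) \wedge t) - (F_{\mathbb{P}}(y-) \wedge t) \big) \\
&\le \sup_{\mathbb{P}\in\mathcal{P}} \int \phi(y)\, \mathbf{1}_{\phi(y) > K}\, \mathbb{P}_1(dy)
 + \sup_{\mathbb{P}\in\mathcal{P}} K\, \mathbb{P}_1[\{z : F_{\mathbb{P}}(z) \le t\}]
 + \sup_{\mathbb{P}\in\mathcal{P}} \int \phi(y)\, \mathbf{1}_{\phi(y) > K}\, \mathbb{P}_1(dy) + K t \\
&\le 2 \sup_{\mathbb{P}\in\mathcal{P}} \int \phi(y)\, \mathbf{1}_{\phi(y) > K}\, \mathbb{P}_1(dy) + 2 K t.
\end{aligned}
\]

In the last step we used $\mathbb{P}_1[\{z : F_{\mathbb{P}}(z) \le t\}] \le t$. Now, by assumption (b) we may choose $K = K_\varepsilon > 0$ such that the first summand is bounded above by $\varepsilon/2$. In particular, $h_{\mathbb{P}}(t) \le \varepsilon$ for all $t \le \varepsilon/(4K_\varepsilon)$ and $\mathbb{P} \in \mathcal{P}$. That is, $h_{\mathbb{P}}$ is (right) continuous at $0$ uniformly in $\mathbb{P} \in \mathcal{P}$. The uniform (right) continuity at $0$ moreover implies uniform equicontinuity of the family $\{h_{\mathbb{P}}\}_{\mathbb{P}\in\mathcal{P}}$ on $[0, 1]$, because for every $t \in [0, 1]$ and $\Delta \in \mathbb{R}$ with $t + \Delta \in [0, 1]$,

\[
\begin{aligned}
\sup_{\mathbb{P}\in\mathcal{P}} |h_{\mathbb{P}}(t) - h_{\mathbb{P}}(t + \Delta)|
&= \sup_{\mathbb{P}\in\mathcal{P}} \int_{t + (\Delta \wedge 0)}^{t + (\Delta \vee 0)} \phi(F_{\mathbb{P}}^{\leftarrow}(s))\, \mathbf{1}_{[0, F_{\mathbb{P}}(0)]}(s)\, ds \\
&\le \sup_{\mathbb{P}\in\mathcal{P}} \int_0^{|\Delta|} \phi(F_{\mathbb{P}}^{\leftarrow}(s))\, \mathbf{1}_{[0, F_{\mathbb{P}}(0)]}(s)\, ds \\
&= \sup_{\mathbb{P}\in\mathcal{P}} h_{\mathbb{P}}(|\Delta|),
\end{aligned}
\]

where we used the fact that $\phi(F_{\mathbb{P}}^{\leftarrow}(\cdot))\, \mathbf{1}_{[0, F_{\mathbb{P}}(0)]}(\cdot)$ is nonincreasing on $[0, 1]$.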

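Step 2. Next, we prepare for Step 3. On the one hand, by the uniform equicontinuity of the family $\{h_{\mathbb{P}}\}_{\mathbb{P}\in\mathcal{P}}$ on the compact interval $[0, 1]$ (cf. Step 1), we can find for every $\varepsilon > 0$ a finite partition $0 = s_0^\varepsilon < s_1^\varepsilon < \cdots < s_{k_\varepsilon}^\varepsilon = 1$ (being independent of $\mathbb{P}$) such that

\[ \sup_{\mathbb{P}\in\mathcal{P}}\ \sup_{i = 1, \dots, k_\varepsilon} \int_{s_{i-1}^\varepsilon}^{s_i^\varepsilon} w_{\mathbb{P}}(s)\, ds \ \Big( = \sup_{\mathbb{P}\in\mathcal{P}}\ \sup_{i = 1, \dots, k_\varepsilon} \big( h_{\mathbb{P}}(s_i^\varepsilon) - h_{\mathbb{P}}(s_{i-1}^\varepsilon) \big) \Big) \le \varepsilon/2. \]

On the other hand, by assumption (b) we may choose a constant $K_\varepsilon > 0$ such that $\sup_{\mathbb{P}\in\mathcal{P}} \int \phi(z)\, \mathbf{1}_{\phi(z) > K_\varepsilon}\, \mathbb{P}_1(dz) \le \varepsilon/2$. Thus, noting

\[
\begin{aligned}
w_{\mathbb{P}}^{\rightarrow}(y) &= \sup\{ s \in [0, 1] : \phi(F_{\mathbb{P}}^{\leftarrow}(s))\, \mathbf{1}_{[0, F_{\mathbb{P}}(0)]}(s) > y \} \\
&\le \sup\{ s \in [0, 1] : s \le F_{\mathbb{P}}(\phi^{\rightarrow}(y)) \wedge F_{\mathbb{P}}(0) \} \\
&\le F_{\mathbb{P}}(\phi^{\rightarrow}(y))
\end{aligned}
\]

for $y \in (0, \infty)$, and using integration-by-parts (more precisely Theorem 1.15 in [23]), we obtain

\[
\begin{aligned}
\sup_{\mathbb{P}\in\mathcal{P}} \int_{K_\varepsilon}^{\infty} w_{\mathbb{P}}^{\rightarrow}(y)\, dy
&\le \sup_{\mathbb{P}\in\mathcal{P}} \int_{K_\varepsilon}^{\infty} F_{\mathbb{P}}(\phi^{\rightarrow}(y))\, dy \\
&= \sup_{\mathbb{P}\in\mathcal{P}} \Big( \int_{(-\infty, \phi^{\rightarrow}(K_\varepsilon)]} \phi(z)\, \mathbb{P}_1(dz) - K_\varepsilon\, F_{\mathbb{P}}(\phi^{\rightarrow}(K_\varepsilon)) \Big) \\
&\le \sup_{\mathbb{P}\in\mathcal{P}} \int_{(-\infty, \phi^{\rightarrow}(K_\varepsilon)]} \phi(z)\, \mathbb{P}_1(dz) \\
&\le \sup_{\mathbb{P}\in\mathcal{P}} \int_{(-\infty, 0]} \phi(z)\, \mathbf{1}_{\phi(z) > K_\varepsilon}\, \mathbb{P}_1(dz) \\
&\le \varepsilon/2,
\end{aligned}
\]

where $\phi^{\rightarrow}(z) := \sup\{ y \in (-\infty, 0] : \phi(y) > z \}$, $z \in [0, \infty)$, denotes the right-continuous inverse of the nonincreasing function $\phi : (-\infty, 0] \to [1, \infty)$. Taking into account that the functions $w_{\mathbb{P}}^{\rightarrow}$ take values only in $[0, 1]$, we can find a finite partition $0 = y_0^\varepsilon < y_1^\varepsilon < \cdots < y_{l_\varepsilon - 1}^\varepsilon < y_{l_\varepsilon}^\varepsilon = \infty$ (being independent of $\mathbb{P}$), with $y_{l_\varepsilon - 1}^\varepsilon = K_\varepsilon$, such that

\[ \sup_{\mathbb{P}\in\mathcal{P}}\ \sup_{i = 1, \dots, l_\varepsilon} \int_{y_{i-1}^\varepsilon}^{y_i^\varepsilon} w_{\mathbb{P}}^{\rightarrow}(y)\, dy \le \varepsilon/2, \]

i.e., in other words,

\[ \sup_{\mathbb{P}\in\mathcal{P}}\ \sup_{i = 1, \dots, l_\varepsilon} \int_0^1 \big( (y_i^\varepsilon \wedge w_{\mathbb{P}}(s)) - (y_{i-1}^\varepsilon \wedge w_{\mathbb{P}}(s)) \big)\, ds \le \varepsilon/2. \]

Finally, for every $\mathbb{P} \in \mathcal{P}$, we let $0 = t_{0,\mathbb{P}}^\varepsilon < t_{1,\mathbb{P}}^\varepsilon < \cdots < t_{m_{\varepsilon,\mathbb{P}},\mathbb{P}}^\varepsilon = 1$ be the partition consisting of all points $s_i^\varepsilon$ and $w_{\mathbb{P}}^{\rightarrow}(y_i^\varepsilon)$, where $m_{\varepsilon,\mathbb{P}} \le k_\varepsilon + l_\varepsilon$, and $k_\varepsilon + l_\varepsilon =: m_\varepsilon$ is independent of $\mathbb{P}$. For notational simplicity we assume without loss of generality $m_{\varepsilon,\mathbb{P}} = m_\varepsilon$ for all $\mathbb{P} \in \mathcal{P}$.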

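Step 3. Let $L^1(dI)$ be the space of all Lebesgue integrable functions on $[0, 1]$, and $[l, u] := \{ f \in L^1(dI) : l \le f \le u \}$ be the bracket of two functions $l, u \in L^1(dI)$ with $l \le u$ pointwise. For any $\varepsilon > 0$, a bracket $[l, u]$ is called $\varepsilon$-bracket if $\int_0^1 (u - l)\, dI \le \varepsilon$; cf. [34, p. 83]. Using the notation introduced in Step 2, we set for every $\mathbb{P} \in \mathcal{P}$ and $i = 1, \dots, m_\varepsilon$

\[ u_{i,\mathbb{P}}^\varepsilon(\cdot) := w_{\mathbb{P}}(t_{i-1,\mathbb{P}}^\varepsilon)\, \mathbf{1}_{[0, t_{i-1,\mathbb{P}}^\varepsilon]}(\cdot) + w_{\mathbb{P}}(\cdot)\, \mathbf{1}_{(t_{i-1,\mathbb{P}}^\varepsilon,\, t_{i,\mathbb{P}}^\varepsilon]}(\cdot). \]

It follows from the choice of the $t_{i,\mathbb{P}}^\varepsilon$ that, for every $\mathbb{P} \in \mathcal{P}$, $[l_{1,\mathbb{P}}^\varepsilon, u_{1,\mathbb{P}}^\varepsilon], \dots, [l_{m_\varepsilon,\mathbb{P}}^\varepsilon, u_{m_\varepsilon,\mathbb{P}}^\varepsilon]$ provide $\varepsilon$-brackets in $L^1(dI)$ covering the class $\mathcal{E}_{\mathbb{P}} := \{ w_{s,\mathbb{P}} : s \in [0, 1] \}$ of functions

\[ w_{s,\mathbb{P}}(\cdot) := w_{\mathbb{P}}(s)\, \mathbf{1}_{[0, s]}(\cdot). \]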

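Step 4. By the usual quantile transformation [3H, p. 103], we can find for every $\mathbb{P} \in \mathcal{P}$ a sequence of $U[0, 1]$-random variables $U_{1,\mathbb{P}}, U_{2,\mathbb{P}}, \dots$ (possibly on an extension $(\Omega_{\mathbb{P}}, \mathcal{F}_{\mathbb{P}}, \overline{\mathbb{P}})$ of the original probability space $(\Omega, \mathcal{F}, \mathbb{P})$) such that the sequence $(U_{i,\mathbb{P}})$ has the same mixing coefficients (under $\overline{\mathbb{P}}$) as the sequence $(X_i)$ under $\mathbb{P}$ and such that the corresponding empirical distribution function $\hat G_{n,\mathbb{P}}$ satisfies $\hat F_n = \hat G_{n,\mathbb{P}} \circ F_{\mathbb{P}}$ $\overline{\mathbb{P}}$-almost surely. Here we will show as in the proof of Theorem 2.4.1 in [33] that $\overline{\mathbb{P}}$-almost surely

\[ \sup_{y \le 0} |\hat F_n(y) - F_{\mathbb{P}}(y)|\, \phi(y) \le \max_{i = 1, \dots, m_\varepsilon} \max\Big\{ \int_0^1 u_{i,\mathbb{P}}^\varepsilon\, d(\hat G_{n,\mathbb{P}} - I)\, ;\ \int_0^1 l_{i,\mathbb{P}}^\varepsilon\, d(I - \hat G_{n,\mathbb{P}}) \Big\} + \varepsilon \tag{17} \]

for every $\varepsilon > 0$. Since $F_{\mathbb{P}}^{\leftarrow}(F_{\mathbb{P}}(y)) \le y$ for all $y \in \mathbb{R}$ (cf. [30, p. 113]) and $\phi$ is nonincreasing on $(-\infty, 0]$, we have $\overline{\mathbb{P}}$-almost surely

\[
\begin{aligned}
\sup_{y \le 0} |\hat F_n(y) - F_{\mathbb{P}}(y)|\, \phi(y)
&= \sup_{y \le 0} |\hat G_{n,\mathbb{P}}(F_{\mathbb{P}}(y)) - F_{\mathbb{P}}(y)|\, \phi(y) \\
&\le \sup_{y \le 0} |\hat G_{n,\mathbb{P}}(F_{\mathbb{P}}(y)) - F_{\mathbb{P}}(y)|\, \phi\big(F_{\mathbb{P}}^{\leftarrow}(F_{\mathbb{P}}(y))\big) \\
&\le \sup_{s \in (0, 1)} |\hat G_{n,\mathbb{P}}(s) - s|\, w_{\mathbb{P}}(s) \\
&= \sup_{s \in (0, 1)} \Big| \int_0^1 w_{s,\mathbb{P}}\, d\hat G_{n,\mathbb{P}} - \int_0^1 w_{s,\mathbb{P}}\, dI \Big|.
\end{aligned}
\]

So for (17) it suffices to show that

\[ \sup_{s \in (0, 1)} \Big| \int_0^1 w_{s,\mathbb{P}}\, d\hat G_{n,\mathbb{P}} - \int_0^1 w_{s,\mathbb{P}}\, dI \Big| \le \max_{i = 1, \dots, m_\varepsilon} \max\Big\{ \int_0^1 u_{i,\mathbb{P}}^\varepsilon\, d(\hat G_{n,\mathbb{P}} - I)\, ;\ \int_0^1 l_{i,\mathbb{P}}^\varepsilon\, d(I - \hat G_{n,\mathbb{P}}) \Big\} + \varepsilon. \tag{18} \]

To prove (18), we note that for every $s \in [0, 1]$ there is some $i_s = i_s(\mathbb{P}) \in \{1, \dots, m_\varepsilon\}$ such that $w_{s,\mathbb{P}} \in [l_{i_s,\mathbb{P}}^\varepsilon, u_{i_s,\mathbb{P}}^\varepsilon]$; cf. Step 3. Therefore, since $[l_{i_s,\mathbb{P}}^\varepsilon, u_{i_s,\mathbb{P}}^\varepsilon]$ is an $\varepsilon$-bracket,

\[
\begin{aligned}
\int_0^1 w_{s,\mathbb{P}}\, d\hat G_{n,\mathbb{P}} - \int_0^1 w_{s,\mathbb{P}}\, dI
&\le \int_0^1 u_{i_s,\mathbb{P}}^\varepsilon\, d\hat G_{n,\mathbb{P}} - \int_0^1 w_{s,\mathbb{P}}\, dI \\
&= \int_0^1 u_{i_s,\mathbb{P}}^\varepsilon\, d(\hat G_{n,\mathbb{P}} - I) + \int_0^1 (u_{i_s,\mathbb{P}}^\varepsilon - w_{s,\mathbb{P}})\, dI \\
&\le \int_0^1 u_{i_s,\mathbb{P}}^\varepsilon\, d(\hat G_{n,\mathbb{P}} - I) + \int_0^1 (u_{i_s,\mathbb{P}}^\varepsilon - l_{i_s,\mathbb{P}}^\varepsilon)\, dI \\
&\le \max_{i = 1, \dots, m_\varepsilon} \int_0^1 u_{i,\mathbb{P}}^\varepsilon\, d(\hat G_{n,\mathbb{P}} - I) + \varepsilon.
\end{aligned}
\]

Analogously we obtain

\[ \int_0^1 w_{s,\mathbb{P}}\, d\hat G_{n,\mathbb{P}} - \int_0^1 w_{s,\mathbb{P}}\, dI \ge - \Big( \max_{i = 1, \dots, m_\varepsilon} \int_0^1 l_{i,\mathbb{P}}^\varepsilon\, d(I - \hat G_{n,\mathbb{P}}) + \varepsilon \Big). \]

That is, (18) and therefore (17) hold true.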
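The identity $\hat F_n = \hat G_{n,\mathbb{P}} \circ F_{\mathbb{P}}$ used in Step 4 is easy to observe numerically when $F_{\mathbb{P}}$ is continuous and strictly increasing on its support. The sketch below assumes i.i.d. observations $X_i = -E_i$ with $E_i$ standard exponential (so that $F(y) = e^{y}$ for $y \le 0$) and sets $U_i := F(X_i)$. These choices, the seed and the helper names are illustrative assumptions; the proof itself allows dependent observations and general $F_{\mathbb{P}}$.

```python
import math
import random


def ecdf(sample, y):
    """Empirical distribution function of `sample` evaluated at y."""
    return sum(1 for v in sample if v <= y) / len(sample)


def F(y):
    """Distribution function of -E with E ~ Exp(1) (illustrative choice)."""
    return math.exp(y) if y <= 0 else 1.0


rng = random.Random(0)                       # fixed seed for reproducibility
xs = [-rng.expovariate(1.0) for _ in range(500)]
us = [F(x) for x in xs]                      # transformed sample U_i = F(X_i)

grid = [-4.0, -2.0, -1.0, -0.5, -0.1, 0.0]
# F_n(y) and G_n(F(y)) count the same observations,
# because x <= y holds iff F(x) <= F(y) for strictly increasing F.
pairs = [(ecdf(xs, y), ecdf(us, F(y))) for y in grid]
```

Each pair in `pairs` consists of two equal numbers, which is the finite-sample content of the quantile transformation in this simple setting.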

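Step 5. Because of (17), for (16) to be true it suffices to show that for every $\delta > 0$

\[ \lim_{n\to\infty} \sup_{\mathbb{P}\in\mathcal{P}} \overline{\mathbb{P}}\Big[ \max_{i = 1, \dots, m_\varepsilon} \max\Big\{ \int_0^1 u_{i,\mathbb{P}}^\varepsilon\, d(\hat G_{n,\mathbb{P}} - I)\, ;\ \int_0^1 l_{i,\mathbb{P}}^\varepsilon\, d(I - \hat G_{n,\mathbb{P}}) \Big\} \ge \delta/2 \Big] = 0 \tag{19} \]

with $\varepsilon = \varepsilon(\delta) := \delta/2$. For (19) to be true it suffices to show that

\[ \lim_{n\to\infty} \sup_{\mathbb{P}\in\mathcal{P}} \overline{\mathbb{P}}\Big[ \int_0^1 l_{i,\mathbb{P}}^\varepsilon\, d(I - \hat G_{n,\mathbb{P}}) \ge \delta/2 \Big] = 0, \tag{20} \]

\[ \lim_{n\to\infty} \sup_{\mathbb{P}\in\mathcal{P}} \overline{\mathbb{P}}\Big[ \int_0^1 u_{i,\mathbb{P}}^\varepsilon\, d(\hat G_{n,\mathbb{P}} - I) \ge \delta/2 \Big] = 0 \tag{21} \]

for every $i = 1, \dots, m_\varepsilon$, because $\overline{\mathbb{P}}$ is subadditive. We will show only (21). Assertion (20) can be shown analogously. Since

\[
\begin{aligned}
\int_0^1 u_{i,\mathbb{P}}^\varepsilon\, d(\hat G_{n,\mathbb{P}} - I)
&= \frac{1}{n} \sum_{j=1}^n \Big( w_{\mathbb{P}}(t_{i-1,\mathbb{P}}^\varepsilon)\, \mathbf{1}_{[0, t_{i-1,\mathbb{P}}^\varepsilon]}(U_{j,\mathbb{P}}) - \overline{\mathbb{E}}\big[ w_{\mathbb{P}}(t_{i-1,\mathbb{P}}^\varepsilon)\, \mathbf{1}_{[0, t_{i-1,\mathbb{P}}^\varepsilon]}(U_{1,\mathbb{P}}) \big] \Big) \\
&\quad + \frac{1}{n} \sum_{j=1}^n \Big( w_{\mathbb{P}}(U_{j,\mathbb{P}})\, \mathbf{1}_{(t_{i-1,\mathbb{P}}^\varepsilon,\, t_{i,\mathbb{P}}^\varepsilon]}(U_{j,\mathbb{P}}) - \overline{\mathbb{E}}\big[ w_{\mathbb{P}}(U_{1,\mathbb{P}})\, \mathbf{1}_{(t_{i-1,\mathbb{P}}^\varepsilon,\, t_{i,\mathbb{P}}^\varepsilon]}(U_{1,\mathbb{P}}) \big] \Big),
\end{aligned}
\]

for (21) to be true it suffices to show that for every fixed $i \in \{1, \dots, m_\varepsilon\}$

\[ \lim_{n\to\infty} \sup_{\mathbb{P}\in\mathcal{P}} \overline{\mathbb{P}}\Big[ \Big| \frac{1}{n} \sum_{j=1}^n \Big( w_{\mathbb{P}}(t_{i-1,\mathbb{P}}^\varepsilon)\, \mathbf{1}_{[0, t_{i-1,\mathbb{P}}^\varepsilon]}(U_{j,\mathbb{P}}) - \overline{\mathbb{E}}\big[ w_{\mathbb{P}}(t_{i-1,\mathbb{P}}^\varepsilon)\, \mathbf{1}_{[0, t_{i-1,\mathbb{P}}^\varepsilon]}(U_{1,\mathbb{P}}) \big] \Big) \Big| \ge \delta/4 \Big] = 0, \tag{22} \]

\[ \lim_{n\to\infty} \sup_{\mathbb{P}\in\mathcal{P}} \overline{\mathbb{P}}\Big[ \Big| \frac{1}{n} \sum_{j=1}^n \Big( w_{\mathbb{P}}(U_{j,\mathbb{P}})\, \mathbf{1}_{(t_{i-1,\mathbb{P}}^\varepsilon,\, t_{i,\mathbb{P}}^\varepsilon]}(U_{j,\mathbb{P}}) - \overline{\mathbb{E}}\big[ w_{\mathbb{P}}(U_{1,\mathbb{P}})\, \mathbf{1}_{(t_{i-1,\mathbb{P}}^\varepsilon,\, t_{i,\mathbb{P}}^\varepsilon]}(U_{1,\mathbb{P}}) \big] \Big) \Big| \ge \delta/4 \Big] = 0. \tag{23} \]

We will show only (23). Assertion (22) can be shown similarly, noting that the inequality $w_{\mathbb{P}}(t_{i-1,\mathbb{P}}^\varepsilon)\, \mathbf{1}_{[0, t_{i-1,\mathbb{P}}^\varepsilon]}(U_{j,\mathbb{P}}) \le w_{\mathbb{P}}(U_{j,\mathbb{P}})\, \mathbf{1}_{[0, t_{i-1,\mathbb{P}}^\varepsilon]}(U_{j,\mathbb{P}})$ holds since $w_{\mathbb{P}}$ is nonincreasing.

Corollary A.2 ensures (23) if we can show that the conditions (26) and (27) in the corollary hold for $\Pi := \mathcal{P}$, $(\Omega_\pi, \mathcal{F}_\pi, \mathbb{P}_\pi) := (\Omega_{\mathbb{P}}, \mathcal{F}_{\mathbb{P}}, \overline{\mathbb{P}})$ and $\xi_{j,\pi} := w_{\mathbb{P}}(U_{j,\mathbb{P}})\, \mathbf{1}_{(t_{i-1,\mathbb{P}}^\varepsilon,\, t_{i,\mathbb{P}}^\varepsilon]}(U_{j,\mathbb{P}})$, $j \in \mathbb{N}$, for every fixed $i \in \{1, \dots, m_\varepsilon\}$. Condition (26) follows from

\[
\begin{aligned}
\limsup_{K\to\infty} \sup_{\mathbb{P}\in\mathcal{P}} \overline{\mathbb{E}}\big[ |\xi_{1,\pi}|\, \mathbf{1}_{|\xi_{1,\pi}| > K} \big]
&\le \limsup_{K\to\infty} \sup_{\mathbb{P}\in\mathcal{P}} \overline{\mathbb{E}}\big[ w_{\mathbb{P}}(U_{1,\mathbb{P}})\, \mathbf{1}_{w_{\mathbb{P}}(U_{1,\mathbb{P}}) > K} \big] \\
&\le \limsup_{K\to\infty} \sup_{\mathbb{P}\in\mathcal{P}} \int_0^1 |\phi(F_{\mathbb{P}}^{\leftarrow}(t))|\, \mathbf{1}_{\phi(F_{\mathbb{P}}^{\leftarrow}(t)) > K}\, dt \\
&= \limsup_{K\to\infty} \sup_{\mathbb{P}\in\mathcal{P}} \int \phi(y)\, \mathbf{1}_{\phi(y) > K}\, \mathbb{P}_1(dy)
\end{aligned}
\]

and assumption (b). Condition (27) is an immediate consequence of assumption (c), Remark A.3 and the fact that the sequence $(U_{j,\mathbb{P}})$ has the same mixing coefficients under $\overline{\mathbb{P}}$ as the sequence $(X_j)$ under $\mathbb{P}$. This completes the proof of Theorem 3.1.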



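6. Proof of Theorem 3.9

Let $\delta > 0$ be arbitrary but fixed. Choose $k_\delta \in \mathbb{N}$ such that $\sum_{k = k_\delta + 1}^{\infty} 2^{-k} \le \delta/3$, and notice that the claim holds if we can show that the following two conditions hold

\[ \lim_{n\to\infty} \sup_{\mathbb{P}\in\mathcal{P}} \mathbb{P}\Big[ \Big| \int f_k\, d\hat m_n - \int f_k\, d\mathbb{P}_1 \Big| \ge \frac{\delta}{3 k_\delta} \Big] = 0, \qquad k = 1, \dots, k_\delta, \tag{24} \]

\[ \lim_{n\to\infty} \sup_{\mathbb{P}\in\mathcal{P}} \mathbb{P}\Big[ \Big| \int \psi\, d\hat m_n - \int \psi\, d\mathbb{P}_1 \Big| \ge \frac{\delta}{3} \Big] = 0. \tag{25} \]

To prove (25), we note that the left-hand side in (25) can be written as

\[ \lim_{n\to\infty} \sup_{\mathbb{P}\in\mathcal{P}} \mathbb{P}\Big[ \Big| \frac{1}{n} \sum_{i=1}^n \psi(X_i) - \mathbb{E}_{\mathbb{P}}[\psi(X_1)] \Big| \ge \frac{\delta}{3} \Big]. \]

The latter is zero by Corollary A.2 with $\Pi := \mathcal{P}$, $(\Omega_\pi, \mathcal{F}_\pi, \mathbb{P}_\pi) := (\Omega, \mathcal{F}, \mathbb{P})$ and the random variables $\xi_{i,\pi} := \psi(X_i)$, $i \in \mathbb{N}$. Indeed: Assumption (26) of the corollary holds by our assumption (b) and the identity $\mathbb{E}_{\mathbb{P}}[|\xi_{1,\pi}|\, \mathbf{1}_{|\xi_{1,\pi}| > K}] = \int \psi\, \mathbf{1}_{\psi > K}\, d\mathbb{P}_1$. Assumption (27) of the corollary holds by the fact that, under every $\mathbb{P} \in \mathcal{P}$, the sequence $(\psi(X_i))$ has the same mixing coefficients as the sequence $(X_i)$ and by our assumption (c) along with Remark A.3.

Assertion (24) can be proven along the same lines, noting that $\mathcal{P}$ also satisfies condition (b) with $\psi$ replaced by $f_k$ for any $k = 1, \dots, k_\delta$; recall that each $f_k$ has compact support. This completes the proof of the theorem.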



A. Uniform weak LLN for strongly mixing random variables

Let $(\xi_i) = (\xi_i)_{i\in\mathbb{N}}$ be a sequence of random variables on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$. According to Rosenblatt [29], the sequence $(\xi_i)$ is said to be strongly mixing (or $\alpha$-mixing) if the $n$th mixing coefficient $\alpha_{\mathbb{P}}(n) := \sup_{k \ge 1} \sup_{A, B} |\mathbb{P}[A \cap B] - \mathbb{P}[A]\,\mathbb{P}[B]|$ converges to zero as $n \to \infty$, where the second supremum ranges over all $A \in \sigma(\xi_1, \dots, \xi_k)$ and $B \in \sigma(\xi_{k+n}, \xi_{k+n+1}, \dots)$. Recall from [m, Inequality (1.9)] that $\alpha_{\mathbb{P}}(n) \le 1/4$ for all $n \in \mathbb{N}$, and use the convention $\alpha_{\mathbb{P}}(0) := 1/4$. For an overview on mixing conditions see [H], [13]. The following result is a consequence of Theorem 4 and Lemma 1 in [28]; cf. (5.1) on page 936 in [28].

Theorem A.1 (Rio) Let $\xi_1, \xi_2, \dots$ be identically distributed with $\mathbb{E}[|\xi_1|] < \infty$, and $\alpha_{\mathbb{P}}(n)$ be defined as above. Let $G$ be the distribution function of $|\xi_1|$, and set $\overline{G} := 1 - G$. Then, for every $x > 0$ and $n \in \mathbb{N}$,

\[ \mathbb{P}\Big[ \sup_{k = 1, \dots, n} \Big| \sum_{i=1}^k (\xi_i - \mathbb{E}[\xi_i]) \Big| \ge 2x \Big] \le \frac{16}{x^2}\, n \sum_{j=0}^{n-1} \int_0^{2\alpha_{\mathbb{P}}(j)} \overline{G}^{\rightarrow}(t)^2\, dt. \]

From Theorem A.1 we can even derive the following uniform weak LLN for strongly mixing random variables. In the special case of independent random variables the corollary is already known from [9]. In fact, [9] provides even a uniform strong LLN.

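Corollary A.2 Let $\Pi \ne \emptyset$ be an arbitrary index set. Further, for every $\pi \in \Pi$, let $(\Omega_\pi, \mathcal{F}_\pi, \mathbb{P}_\pi)$ be a probability space and $\xi_{1,\pi}, \xi_{2,\pi}, \dots$ be a sequence of random variables on $(\Omega_\pi, \mathcal{F}_\pi, \mathbb{P}_\pi)$ being identically distributed and strongly mixing with mixing coefficients $(\alpha_\pi(n)) := (\alpha_{\mathbb{P}_\pi}(n))$. Further suppose that the following two conditions hold

\[ \lim_{K\to\infty} \sup_{\pi\in\Pi} \mathbb{E}_\pi\big[ |\xi_{1,\pi}|\, \mathbf{1}_{|\xi_{1,\pi}| > K} \big] = 0, \tag{26} \]

\[ \lim_{n\to\infty} \sup_{\pi\in\Pi} \frac{1}{n} \sum_{j=0}^{n-1} \alpha_\pi(j) = 0. \tag{27} \]

Then, for every $\delta > 0$,

\[ \lim_{n\to\infty} \sup_{\pi\in\Pi} \mathbb{P}_\pi\Big[ \Big| \frac{1}{n} \sum_{i=1}^n \big( \xi_{i,\pi} - \mathbb{E}_\pi[\xi_{i,\pi}] \big) \Big| \ge \delta \Big] = 0. \tag{28} \]

Remark A.3 By the Toeplitz lemma, (27) holds if $\lim_{n\to\infty} \sup_{\pi\in\Pi} \alpha_\pi(n) = 0$. $\diamond$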
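Remark A.3 rests on the fact that the Cesàro means of a null sequence tend to zero. A minimal numerical sketch, assuming hypothetical coefficients $\alpha(j) = \min\{1/4, 1/j\}$ for $j \ge 1$ and $\alpha(0) = 1/4$ (these values are not taken from the text and serve only as an illustration):

```python
def cesaro_means(alpha, n_max):
    """Return [(1/n) * sum_{j=0}^{n-1} alpha(j) for n = 1, ..., n_max]."""
    means, running = [], 0.0
    for n in range(1, n_max + 1):
        running += alpha(n - 1)      # add the term with index j = n - 1
        means.append(running / n)
    return means


def alpha(j):
    """Hypothetical mixing coefficients, capped at 1/4 as in Appendix A."""
    return 0.25 if j == 0 else min(0.25, 1.0 / j)


means = cesaro_means(alpha, 10_000)
# means[n - 1] is the average appearing in condition (27) for a single law.
```

Here the averages decrease slowly (roughly like $\log n / n$), which is enough for (27) even though the coefficients themselves are not summable.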



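Proof (of Corollary A.2) We will use a truncation argument. Let $K > 0$ be a constant (to be specified later on) and $\xi_{i,\pi}^K := \xi_{i,\pi}\, \mathbf{1}_{|\xi_{i,\pi}| \le K}$ be the $K$-truncation of $\xi_{i,\pi}$. Using the decomposition $\xi_{i,\pi} = \xi_{i,\pi}^K + \xi_{i,\pi}\, \mathbf{1}_{|\xi_{i,\pi}| > K}$ and the triangle inequality, we obtain

\[
\begin{aligned}
\mathbb{P}_\pi\Big[ \Big| \frac{1}{n} \sum_{i=1}^n \big( \xi_{i,\pi} - \mathbb{E}_\pi[\xi_{i,\pi}] \big) \Big| \ge \delta \Big]
&\le \mathbb{P}_\pi\Big[ \Big| \frac{1}{n} \sum_{i=1}^n \big( \xi_{i,\pi}^K - \mathbb{E}_\pi[\xi_{i,\pi}^K] \big) \Big| \ge \frac{\delta}{3} \Big]
 + \mathbb{P}_\pi\Big[ \Big| \frac{1}{n} \sum_{i=1}^n \xi_{i,\pi}\, \mathbf{1}_{|\xi_{i,\pi}| > K} \Big| \ge \frac{\delta}{3} \Big] \\
&\quad + \mathbb{P}_\pi\Big[ \big| \mathbb{E}_\pi[\xi_{1,\pi}\, \mathbf{1}_{|\xi_{1,\pi}| > K}] \big| \ge \frac{\delta}{3} \Big] \\
&=: S_1(\delta, n, K, \pi) + S_2(\delta, n, K, \pi) + S_3(\delta, K, \pi).
\end{aligned}
\]

By Markov's inequality, $S_2(\delta, n, K, \pi)$ is bounded above by $3\delta^{-1} \sup_{\pi\in\Pi} \mathbb{E}_\pi[|\xi_{1,\pi}|\, \mathbf{1}_{|\xi_{1,\pi}| > K}]$. So, for given $\varepsilon > 0$, one can choose $K = K_{\varepsilon,\delta}$ in such a way that

\[ \sup_{\pi\in\Pi} S_2(\delta, n, K, \pi) \le \varepsilon/2 \quad \forall\, n \in \mathbb{N}, \]

because we assumed (26). By (26), we may and do also assume that $K = K_{\varepsilon,\delta}$ is chosen such that

\[ \sup_{\pi\in\Pi} S_3(\delta, K, \pi) = 0. \]

Choosing $x := n\delta/6$ in Theorem A.1 we further obtain

\[
\begin{aligned}
\sup_{\pi\in\Pi} S_1(\delta, n, K, \pi)
&\le \frac{576}{\delta^2} \sup_{\pi\in\Pi} \frac{1}{n} \sum_{j=0}^{n-1} \int_0^{2\alpha_\pi(j)} \overline{G}_{K,\pi}^{\rightarrow}(t)^2\, dt \\
&\le \frac{1152\, K^2}{\delta^2} \sup_{\pi\in\Pi} \frac{1}{n} \sum_{j=0}^{n-1} \alpha_\pi(j),
\end{aligned}
\]

where $G_{K,\pi}$ denotes the distribution function of $|\xi_{1,\pi}^K|$. So, in view of (27), we can find some $n_\varepsilon \in \mathbb{N}$ such that

\[ \sup_{\pi\in\Pi} S_1(\delta, n, K, \pi) \le \varepsilon/2 \quad \forall\, n \ge n_\varepsilon. \]

Thus, (28) holds.

B. Strong mixing of linear processes

Let $(Z_s)_{s\in\mathbb{Z}}$ be i.i.d. random variables on some probability space $(\Omega, \mathcal{F}, \mathbb{P})$, and define the linear process $X_t := \sum_{s=0}^{\infty} a_s Z_{t-s}$, $t \in \mathbb{N}$, for any real sequence $(a_s)_{s\in\mathbb{N}_0}$ for which the latter series is $\mathbb{P}$-almost surely well defined for every $t \in \mathbb{N}$. Moreover, let $\alpha_{\mathbb{P}}(n)$ be the $n$th strong mixing coefficient of the sequence $(X_t)$ under $\mathbb{P}$ as defined at the beginning of Section A. The following criterion for $(X_t)$ to be strongly mixing is an immediate consequence of Lemmas 2.1 and 2.2 in [26]; notice that we will assume $a_0 = 1$.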

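Theorem B.1 Let $a_0 = 1$, and assume that the following assertions hold:

(a) The distribution of $Z_1$ admits a Lebesgue density $f$ for which $\int |f(y + h) - f(y)|\, dy \le M |h|$ for all $h \in \mathbb{R}$ and some constant $M > 0$.

(b) $\mathbb{E}[|Z_1|] < \infty$.

(c) $\sum_{s=0}^{\infty} a_s z^s \ne 0$ for all $z$ with $|z| \le 1$.

(d) $\sum_{u=1}^{\infty} \sum_{s=u}^{\infty} |a_s| < \infty$.

Then, for every $n \in \mathbb{N}$,

\[ \alpha_{\mathbb{P}}(n) \le \Big( 2 M\, \mathbb{E}[|Z_1|] \sum_{s=0}^{\infty} |b_s| \Big) \sum_{u=n}^{\infty} \sum_{s=u}^{\infty} |a_s|, \tag{29} \]

where $b_s$ is the coefficient of $z^s$ in the power series expansion of $z \mapsto 1 / \sum_{s=0}^{\infty} a_s z^s$. In particular, $(X_t)$ is strongly mixing.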
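As an illustration of the kind of process covered by Theorem B.1, consider the Gaussian AR(1) process $X_t = \rho X_{t-1} + Z_t$, i.e. the linear process with $a_s = \rho^s$, which satisfies (a)-(d) for $|\rho| < 1$. The following Monte Carlo sketch estimates $\mathbb{P}[|\frac{1}{n}\sum_{i=1}^n X_i| \ge \delta]$ for one such law. The parameter values, the seed, the burn-in and the function names are our own illustrative choices; the sketch only visualises the weak LLN behind Corollary A.2 for a single law and says nothing about uniformity over a class of laws.

```python
import random


def ar1_sample_mean(n, rho, rng, burn_in=200):
    """Sample mean of n consecutive values of X_t = rho * X_{t-1} + Z_t
    with standard normal innovations, after a burn-in period."""
    x = 0.0
    for _ in range(burn_in):             # approach the stationary regime
        x = rho * x + rng.gauss(0.0, 1.0)
    total = 0.0
    for _ in range(n):
        x = rho * x + rng.gauss(0.0, 1.0)
        total += x
    return total / n


def exceedance_frequency(n, delta, rho=0.5, reps=200, seed=1):
    """Monte Carlo estimate of P[|sample mean| >= delta] (true mean is 0)."""
    rng = random.Random(seed)
    hits = sum(1 for _ in range(reps)
               if abs(ar1_sample_mean(n, rho, rng)) >= delta)
    return hits / reps


freq_small = exceedance_frequency(50, 0.2)     # short sample
freq_large = exceedance_frequency(2000, 0.2)   # long sample
```

With these settings the estimated exceedance frequency is substantial for $n = 50$ and essentially vanishes for $n = 2000$.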



It can be seen from Lemma 2.1 in [26] that the right-hand side in (29) provides an upper bound even for the $\beta$-mixing coefficient, i.e. for the mixing coefficient in the context of absolute regularity. Notice that (d) implies $\sum_{s=0}^{\infty} |a_s| < \infty$ which, together with (c) and Wiener's theorem (cf. [35, p. 91] or [37, p. 301]), ensures that $\sum_{s=0}^{\infty} |b_s| < \infty$. Further conditions for a linear process to be strongly mixing can be found in [15], [2U], [36].

Remark B.2 (i) Condition (d) is satisfied if $|a_s| \le C s^{-\gamma}$ for all $s \in \mathbb{N}$ and some constants $C > 0$, $\gamma > 2$. In this case, $\sum_{u=n}^{\infty} \sum_{s=u}^{\infty} |a_s| \le C_\gamma\, n^{-\gamma + 2}$ holds for all $n \in \mathbb{N}$ and some constant $C_\gamma > 0$.

(ii) Condition (d) is also satisfied if $|a_s| \le C q^s$ for all $s \in \mathbb{N}_0$ and some constants $C > 0$, $q \in (0, 1)$. In this case, $\sum_{u=n}^{\infty} \sum_{s=u}^{\infty} |a_s| \le C (1 - q)^{-2} q^n$ holds for all $n \in \mathbb{N}$. $\diamond$
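The two bounds in Remark B.2 can be checked numerically. The sketch below truncates the double series at a finite horizon and uses the rearrangement $\sum_{u \ge n} \sum_{s \ge u} |a_s| = \sum_{s \ge n} (s - n + 1) |a_s|$. For part (i) it uses the admissible constant $C_\gamma = C\gamma/(\gamma - 2)$, which is our own elementary estimate (comparison with integrals) and not a constant stated in the text; the parameter values are likewise illustrative.

```python
def double_tail_sum(a, n, horizon):
    """Approximate sum_{u>=n} sum_{s>=u} |a(s)|, truncating s at `horizon`.

    Each |a(s)| with s >= n is counted once for every u in {n, ..., s}.
    """
    return sum((s - n + 1) * abs(a(s)) for s in range(n, horizon + 1))


C, q, gamma = 2.0, 0.6, 3.0


def geometric(s):
    return C * q ** s            # case (ii): |a_s| = C q^s


def polynomial(s):
    return C * s ** (-gamma)     # case (i): |a_s| = C s^(-gamma), s >= 1


# Case (ii): the bound C (1 - q)^(-2) q^n is attained for a_s = C q^s.
geo_gap = max(
    abs(double_tail_sum(geometric, n, 400) - C * q ** n / (1 - q) ** 2)
    for n in range(1, 6)
)

# Case (i): check the bound with the assumed constant C * gamma / (gamma - 2).
poly_ok = all(
    double_tail_sum(polynomial, n, 20_000)
    <= C * gamma / (gamma - 2) * n ** (2 - gamma)
    for n in range(1, 6)
)
```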

Example B.3 If $a_s = a q^s$, $s \ge 1$, with $a \ne 0$ and $|q| < 1$, then we have for $|z| \le 1$

\[ \sum_{s=0}^{\infty} a_s z^s = (a_0 - a) + \frac{a}{1 - q z} = \frac{a_0 - \{(a_0 - a) q\} z}{1 - q z}. \]

From here one easily derives that $1 / \sum_{s=0}^{\infty} a_s z^s$ admits the representation $\sum_{s=0}^{\infty} b_s z^s$ with $b_0 = 1/a_0$ and $b_s = -a (a_0 - a)^{s-1} a_0^{-(s+1)} q^s$, $s \ge 1$, using the convention $0^0 := 1$.

If specifically $a_0 = 1$, $a = (\phi_1 + \theta_1)/\phi_1$ and $q = \phi_1$ for real numbers $\phi_1$ and $\theta_1$ satisfying $0 < |\phi_1| < 1$ and $|\theta_1| < 1$, then $b_s = -(\phi_1 + \theta_1)(-\theta_1)^{s-1}$ for $s \ge 1$, and hence $\sum_{s=0}^{\infty} |b_s| = 1 + |\phi_1 + \theta_1| / (1 - |\theta_1|)$. $\diamond$
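The coefficients $b_s$ can also be obtained numerically from the recursion $\sum_{k=0}^{s} a_k b_{s-k} = 0$, $s \ge 1$, which follows from $(\sum_s a_s z^s)(\sum_s b_s z^s) = 1$. The sketch below compares this recursion with the closed form of Example B.3 and with the stated value of $\sum_s |b_s|$; the function name `reciprocal_series` and all parameter values are illustrative assumptions.

```python
def reciprocal_series(a, n_terms):
    """Coefficients b_0, ..., b_{n_terms-1} of 1 / sum_s a[s] z^s.

    Requires a[0] != 0 and len(a) >= n_terms.
    """
    b = [1.0 / a[0]]
    for s in range(1, n_terms):
        # The coefficient of z^s in the product of both series must vanish.
        conv = sum(a[k] * b[s - k] for k in range(1, s + 1))
        b.append(-conv / a[0])
    return b


# General case of Example B.3: a_s = a * q^s for s >= 1.
a0, a, q, N = 2.0, 0.5, 0.3, 12
coeffs = [a0] + [a * q ** s for s in range(1, N)]
b_rec = reciprocal_series(coeffs, N)
b_closed = [1.0 / a0] + [
    -a * (a0 - a) ** (s - 1) * a0 ** (-(s + 1)) * q ** s for s in range(1, N)
]
max_gap = max(abs(x - y) for x, y in zip(b_rec, b_closed))

# Special case a_0 = 1, a = (phi1 + theta1) / phi1, q = phi1.
phi1, theta1, M = 0.5, 0.4, 60
arma = [1.0] + [(phi1 + theta1) / phi1 * phi1 ** s for s in range(1, M)]
sum_abs_b = sum(abs(x) for x in reciprocal_series(arma, M))
expected = 1 + abs(phi1 + theta1) / (1 - abs(theta1))
```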

Example B.4 (ARMA process) To illustrate conditions (c)-(d) in Theorem B.1 let us consider an ARMA($p, q$) process $X_t = \sum_{s=1}^{p} \phi_s X_{t-s} + \sum_{s=0}^{q} \theta_s Z_{t-s}$ with $\theta_0 = 1$ and square-integrable and centered i.i.d. innovations $(Z_s)_{s\in\mathbb{Z}}$. Define the characteristic polynomials $\phi(z) := 1 - \sum_{s=1}^{p} \phi_s z^s$ and $\theta(z) := \sum_{s=0}^{q} \theta_s z^s$, and assume that $\phi(\cdot)$ and $\theta(\cdot)$ have no common zeros. Further assume that $(X_t)$ is both causal and invertible. By the causality we have that $\phi(z) \ne 0$ for all complex $z$ with $|z| \le 1$, and that $X$ admits the MA($\infty$) representation $X_t = \sum_{s=0}^{\infty} a_s Z_{t-s}$ with the coefficients $a_s$ determined by

\[ a(z) := \sum_{s=0}^{\infty} a_s z^s = \frac{\theta(z)}{\phi(z)}, \qquad |z| \le 1; \tag{30} \]

see [6, Theorem 3.1.1]. By the invertibility we have in addition that $\theta(z) \ne 0$, and hence $a(z) \ne 0$, for all complex $z$ with $|z| \le 1$; see [6, Theorem 3.1.2]. That is, under the imposed assumptions the ARMA process can be regarded as a linear process satisfying condition (c) in Theorem B.1.

If specifically $p = q = 1$, $|\phi_1|, |\theta_1| < 1$, and $\phi_1 \ne -\theta_1$ (so that $\phi(\cdot)$ and $\theta(\cdot)$ have no common zero), then the coefficients $a_s$ determined by (30) read as $a_0 = 1$ and $a_s = (\phi_1 + \theta_1) \phi_1^{s-1}$ $(= ((\phi_1 + \theta_1)/\phi_1)\, \phi_1^{s}$ if $\phi_1 \ne 0)$, $s \ge 1$. In this case the ARMA(1,1) process can be seen as a linear process which satisfies not only condition (c) but also condition (d) of Theorem B.1. By part (ii) of Remark B.2 we have in particular $\sum_{u=n}^{\infty} \sum_{s=u}^{\infty} |a_s| = \big( |\phi_1 + \theta_1| / (1 - |\phi_1|)^2 \big)\, |\phi_1|^{n-1}$ for all $n \in \mathbb{N}$. Moreover, Example B.3 yields $\sum_{s=0}^{\infty} |b_s| = 1 + |\phi_1 + \theta_1| / (1 - |\theta_1|)$, where $b_s$ is the coefficient of $z^s$ in the power series expansion of $z \mapsto 1 / \sum_{s=0}^{\infty} a_s z^s$. $\diamond$
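To get a feeling for the size of the bound (29), the following sketch evaluates its right-hand side for an ARMA(1,1) process using the closed-form expressions above. It assumes standard normal innovations, for which condition (a) of Theorem B.1 holds with $M = \int |f'(y)|\, dy = \sqrt{2/\pi}$ and $\mathbb{E}[|Z_1|] = \sqrt{2/\pi}$. The Gaussian assumption, the parameter values and the function name `arma11_mixing_bound` are ours and are not taken from the text.

```python
import math


def arma11_mixing_bound(phi, theta, n):
    """Right-hand side of (29) for a causal, invertible ARMA(1,1) process
    with standard normal innovations (an assumption of this sketch)."""
    M = math.sqrt(2.0 / math.pi)           # total variation of the N(0,1) density
    mean_abs_z = math.sqrt(2.0 / math.pi)  # E|Z_1| for Z_1 ~ N(0,1)
    # Sum of |b_s| from Example B.3.
    sum_abs_b = 1.0 + abs(phi + theta) / (1.0 - abs(theta))
    # Double tail sum of |a_s| from Example B.4.
    double_tail = abs(phi + theta) * abs(phi) ** (n - 1) / (1.0 - abs(phi)) ** 2
    return 2.0 * M * mean_abs_z * sum_abs_b * double_tail


bounds = [arma11_mixing_bound(0.5, 0.4, n) for n in range(1, 11)]
# First lag at which the bound is smaller than the trivial bound 1/4.
first_informative = next(n for n, b in enumerate(bounds, start=1) if b < 0.25)
```

Since mixing coefficients never exceed $1/4$, the bound is informative only once it drops below $1/4$; for $\phi_1 = 0.5$ and $\theta_1 = 0.4$ this happens at $n = 7$ in this sketch, after which the bound halves with every additional lag.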